I am scraping URLs that I read from a CSV file, and each URL has a name. How can I download these URLs and save each one under its name?
import csv
import scrapy

urls = []
reader = csv.reader(open("source1.csv"))
for Name, Sources1 in reader:
    urls.append(Sources1)

class Spider(scrapy.Spider):
    name = "test"
    start_urls = urls[1:]  # skip the first row

    def parse(self, response):
        filename = Name + '.pdf'  # how can I get the names I read from the CSV file?
Answer 0 (score: 2)
Maybe you want to override the start_requests() method instead of using start_urls?
Example:
class MySpider(scrapy.Spider):
    name = 'test'

    def start_requests(self):
        data = read_csv()  # placeholder: yields rows exposing .url and .name
        for d in data:
            yield scrapy.Request(d.url, meta={'name': d.name})
The request's meta dict is carried over to the response, so later you can do:
    def parse(self, response):
        name = response.meta.get('name')
        ...