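I have a sitemap spider that collects links into a CSV file. I want to crawl those links with a CSV spider. How can I feed the output of one spider into another?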
Answer 0 (score: 1)
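Take a look at the example in the official documentation, shown below. To use it with a local file, simply replace the HTTP URL in start_urls with a file URL (file:// scheme).

from scrapy.spiders import CSVFeedSpider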
from myproject.items import TestItem


class MySpider(CSVFeedSpider):
    name = 'example.com'
    allowed_domains = ['example.com']
    start_urls = ['http://www.example.com/feed.csv']
    delimiter = ';'
    quotechar = "'"
    headers = ['id', 'name', 'description']

    def parse_row(self, response, row):
        self.logger.info('Hi, this is a row!: %r', row)

        item = TestItem()
        item['id'] = row['id']
        item['name'] = row['name']
        item['description'] = row['description']
        return item
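
The handoff between the two spiders could look roughly like the sketch below. It uses only the standard library and rests on assumptions that are not in the question: the first spider is assumed to have written a CSV file with a header row and a column named url, and the helper names to_file_url and read_links are made up for illustration.

```python
import csv
from pathlib import Path


def to_file_url(csv_path):
    """Turn a local path into a file:// URL suitable for start_urls."""
    # resolve() makes the path absolute, which as_uri() requires.
    return Path(csv_path).resolve().as_uri()


def read_links(csv_path, column="url"):
    """Return the non-empty values of one column from a CSV with a header row.

    The column name "url" is an assumption about what the first spider wrote.
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # Rows with a missing or empty value in the column are skipped.
        return [row[column] for row in reader if row.get(column)]
```

With these helpers there are two ways to connect the spiders. Either set start_urls = [to_file_url(path_to_csv)] on the CSVFeedSpider so that parse_row receives each exported row, or use the list returned by read_links(path_to_csv) as the start_urls of an ordinary spider so that every collected link is requested directly. Which one fits depends on whether the second spider needs the rows themselves or the pages the links point to.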