I am trying to export my scraped data to a file from the command line:
scrapy crawl tunisaianet -o save.csv -t csv
but nothing happens. Can anyone help?
Here is my code:
import scrapy
import csv
from tfaw.items import TfawItem

class TunisianetSpider(scrapy.Spider):
    name = "tunisianet"
    allowed_domains = ["tunisianet.com.tn"]
    start_urls = [
        'http://www.tunisianet.com.tn/466-consoles-jeux/',
    ]

    def parse(self, response):
        item = TfawItem()
        data= []
        out = open('out.csv', 'a')
        x = response.xpath('//*[contains(@class, "ajax_block_product")]')
        for i in range(0, len(x)):
            item['revendeur'] = response.xpath('//*[contains(@class, "center_block")]/h2/a/@href').re('tunisianet')[i]
            item['produit'] = response.xpath('//*[contains(@class, "center_block")]/h2/a/text()').extract()[i]
            item['url'] = response.xpath('//*[contains(@class, "center_block")]/h2/a/@href').extract()[i]
            item['description'] = response.xpath('//*[contains(@class, "product_desc")]/a/text()').extract()[i]
            item['prix'] = response.xpath('//*[contains(@class, "price")]/text()').extract()[i]
            data = item['revendeur'], item['produit'], item['url'], item['description'], item['prix']
            yield data
            out.write(str(data))
            out.write('\n')
Answer 0 (score: 1)
I assume you are getting this error:
ERROR: Spider must return Request, BaseItem, dict or None, got 'tuple' in <GET http://www.tunisianet.com.tn/466-consoles-jeux>
It tells you exactly what is wrong: you are yielding a tuple where Scrapy expects an item. Change your yield code to:
        ...
            item['prix'] = response.xpath('//*[contains(@class, "price")]/text()').extract()[i]
            yield item
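
Since the error message also lists dict as an accepted return type, the same idea can be sketched without any Scrapy-specific code. The helper below, build_items, is a hypothetical name and not part of Scrapy or of the original spider; it assumes the four lists are what the .extract() calls in parse return, and that the reseller is always 'tunisianet'. It builds a fresh dict for every product, so the rows do not share one mutable object.

```python
def build_items(produits, urls, descriptions, prix):
    """Yield one new dict per product from parallel lists of extracted strings.

    Hypothetical helper for illustration; the field names mirror the
    question's item fields.
    """
    # zip stops at the shortest list, so a missing field on the page
    # shortens the output instead of raising IndexError.
    for produit, url, description, p in zip(produits, urls, descriptions, prix):
        yield {
            'revendeur': 'tunisianet',  # assumption: single fixed reseller
            'produit': produit.strip(),
            'url': url,
            'description': description.strip(),
            'prix': p.strip(),
        }


if __name__ == '__main__':
    # Made-up sample values standing in for the extracted page data.
    for row in build_items(
        ['Console A ', 'Console B'],
        ['http://example.com/a', 'http://example.com/b'],
        ['First description', 'Second description'],
        [' 100 DT', '200 DT'],
    ):
        print(row)
```

Inside parse, the spider could then loop over build_items(...) and yield each dict, leaving the CSV writing to the -o option instead of opening out.csv by hand.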