I found a few solutions that do this by overriding the item class with an OrderedItem:
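import json
from collections import OrderedDict

import scrapy
import six


class OrderedItem(scrapy.Item):
    def __init__(self, *args, **kwargs):
        # Store values in an OrderedDict so fields keep their assignment order.
        self._values = OrderedDict()
        if args or kwargs:
            for k, v in six.iteritems(dict(*args, **kwargs)):
                self[k] = v

    def __repr__(self):
        return json.dumps(OrderedDict(self), ensure_ascii=False)


class NewItem(OrderedItem):
    title = scrapy.Field()
    price = scrapy.Field()

I have more data to extract, and the order is different every time.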
Then, in the spider script, I create an instance of NewItem:
def parse(self, response):
    items = NewItem()
    # Each XPath predicate must be closed with ']' before '/text()'.
    items['title'] = response.xpath(
        "//span[@class='pdp-mod-product-badge-title']/text()").extract_first()
    items['price'] = response.xpath(
        "//span[contains(@class, 'pdp-price')]/text()").extract_first()
    yield items
Answer 0 (score: 1)
You need to define the field order in settings.py:
FEED_EXPORT_FIELDS = ["title", "price"]
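To illustrate the effect of a fixed field list, here is a minimal standard-library sketch. It is not Scrapy's own exporter code: EXPORT_FIELDS and export_csv are hypothetical names standing in for the setting and the feed export, and csv.DictWriter is used only as an analogy for how a list of field names fixes the column order no matter how each item was filled in.

```python
import csv
import io

# Hypothetical stand-in for FEED_EXPORT_FIELDS: the columns to export, in order.
EXPORT_FIELDS = ["title", "price"]


def export_csv(items, fields=EXPORT_FIELDS):
    """Write dict-like items as CSV, with columns in the order given by `fields`."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fields, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for item in items:
        writer.writerow(item)
    return buffer.getvalue()


# The keys are set in a different order on each item, as in the question.
items = [
    {"price": "10.00", "title": "Widget"},
    {"title": "Gadget", "price": "12.50"},
]
output = export_csv(items)
print(output)
```

Both rows come out with title first and price second, because the field list, not the item, decides the order. In Scrapy the same list can also be set for a single spider through its custom_settings attribute instead of settings.py.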