Suppose I want to define an Item model named Product that contains a key named @type.
class Product(scrapy.Item):
    name = scrapy.Field()
    price = scrapy.Field()
    stock = scrapy.Field()
    @type = scrapy.Field()
Obviously, since @type is not a valid Python identifier, the definition above is illegal in Python.
Yet JSON like this is perfectly valid:
{
    "name": "Battery",
    "price": 1.00,
    "stock": 10,
    "@type": "Product"
}
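To make the mismatch concrete, here is a small standard-library sketch (no Scrapy involved). The helper name is_valid_class_body is made up for this illustration. It checks that "@type" is rejected as an attribute name in a class body but is accepted as an ordinary string key that serializes to JSON.

```python
import json

# "@type" is not a legal Python identifier, while "name" is.
at_type_ok = "@type".isidentifier()
name_ok = "name".isidentifier()


def is_valid_class_body(line):
    """Hypothetical helper: does `line` compile as a class attribute?"""
    source = "class Product:\n    " + line + "\n"
    try:
        compile(source, "<sketch>", "exec")
        return True
    except SyntaxError:
        return False


# As a plain string key, "@type" causes no trouble at all.
record = {"name": "Battery", "price": 1.00, "stock": 10, "@type": "Product"}
encoded = json.dumps(record)
decoded = json.loads(encoded)

print(at_type_ok, name_ok)
print(is_valid_class_body("name = 1"), is_valid_class_body("@type = 1"))
print(decoded["@type"])
```

So the restriction comes from Python's class syntax, not from the data itself.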
Does anyone know how to do this properly in Scrapy?
Answer 0 (score: 0)
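Since scrapy.Item behaves like a dict and keeps its declared fields in the fields attribute, you can override __init__() and register the extra field by its string key: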
import scrapy

class Product(scrapy.Item):
    name = scrapy.Field()
    price = scrapy.Field()
    stock = scrapy.Field()

    def __init__(self, *args, **kwargs):
        super(Product, self).__init__(*args, **kwargs)
        # '@type' is not a valid identifier, so register it by string key
        self.fields['@type'] = scrapy.Field()
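To show why registering the key in fields is enough, here is a simplified stand-in written without Scrapy. SketchItem and Plain are hypothetical classes invented for this sketch, not Scrapy's actual implementation. The only behaviour copied is that assigning to a key missing from fields raises KeyError.

```python
class SketchItem(dict):
    """Hypothetical stand-in for an Item: only declared keys may be set."""

    fields = {}

    def __setitem__(self, key, value):
        if key not in self.fields:
            raise KeyError(
                "%s does not support field: %s" % (type(self).__name__, key)
            )
        super().__setitem__(key, value)


class Plain(SketchItem):
    # '@type' is never declared here, so assigning it must fail.
    fields = {"name": None}


class Product(SketchItem):
    fields = {"name": None, "price": None, "stock": None}

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Register the extra key by string, mirroring the override above.
        self.fields["@type"] = None


plain = Plain()
try:
    plain["@type"] = "Product"
    rejected = False
except KeyError:
    rejected = True

item = Product()
item["@type"] = "Product"
print(rejected, dict(item))
```

Note that in this sketch fields is a class-level dict, so the key added in __init__() is shared by every Product instance created afterwards.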
Example spider:
from scrapy import Item, Field
from scrapy import Spider


class Product(Item):
    name = Field()
    price = Field()
    stock = Field()

    def __init__(self, *args, **kwargs):
        super(Product, self).__init__(*args, **kwargs)
        self.fields['@type'] = Field()


class ProductSpider(Spider):
    name = "product_spider"
    start_urls = ['http://google.com']

    def parse(self, response):
        item = Product()
        item['name'] = 'Test name'
        item['price'] = 0
        item['stock'] = True
        item['@type'] = 'Test type'
        return item
Output:
$ scrapy runspider spider1.py
2014-08-11 14:32:00-0400 [product_spider] DEBUG: Scraped from <200 http://www.google.com/>
{'@type': 'Test type', 'name': 'Test name', 'price': 0, 'stock': True}