Python Scrapy: what is the difference between item._values and item.fields?

Time: 2016-01-04 14:49:40

Tags: python scrapy

(Using Scrapy 1.0.0) I have declared an Item with three fields:

import scrapy


class MyItem(scrapy.Item):
    # define the fields for your item here
    correct_ans = scrapy.Field()
    pronunciation = scrapy.Field()
    good_alts = scrapy.Field()

    def __init__(self, correct_ans, pronunciation):
        # bad style?? (writes to the private _values dict directly)
        super(MyItem, self).__init__()
        self._values['correct_ans'] = correct_ans
        self._values['pronunciation'] = pronunciation
        self._values['good_alts'] = []

Strangely, after I instantiate a new MyItem, self._values and self.fields have different keys (see the pdb output below). The question is: why? Is this by design?

(Pdb) l
 24         def parse(self, response):
 25             for vocab in self.vocabs:
 26                 item = MyItem(vocab['correct_ans'], vocab['pronunciation'])
 27                 pdb.set_trace()
 28  ->             request = FormRequest.from_response(response, formnumber=1, 
 29                                                     formdata = {'word': item['pronunciation']}, 
 30                                                     callback =     self.parse_pinyin_match_page)
 31                 logging.debug("OK:" + str(request))
 32                 request.meta['item'] = item
 33                 yield request
(Pdb) item.fields
{'correct_ans': {}, 'pronunciation': {}}
(Pdb) item._values
{'correct_ans': '\xe9\x99\x88\xe5\xa4\xa7\xe4\xb8\x9c', 'pronunciation': 'chendadong', 'good_alts': []}
(Pdb) 'good_alts' in item.fields
False
(Pdb) 'good_alts' in item._values
True
(Pdb) 

I had no problems until I tried to use the built-in CSV exporter. Now Scrapy exits with the following error:

  File "/Library/Python/2.7/site-packages/scrapy/exporters.py", line 188, in export_item
    values = [x[1] for x in fields]
  File "/Library/Python/2.7/site-packages/scrapy/exporters.py", line 72, in _get_serialized_fields
    field = {} if isinstance(item, dict) else item.fields[field_name]
KeyError: 'good_alts'
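
For context, here is a minimal sketch of how a declared schema (fields) and the per-instance storage (_values) can drift apart. SketchItem is a hypothetical stand-in written for illustration, not Scrapy's actual code; it only assumes that item-style assignment checks the key against the declared fields, while a direct write to _values skips that check.

```python
class SketchItem:
    """Hypothetical stand-in for scrapy.Item; not Scrapy's actual code."""

    # Declared schema. 'good_alts' is deliberately missing, as in the pdb output.
    fields = {"correct_ans": {}, "pronunciation": {}}

    def __init__(self):
        self._values = {}  # data actually stored on this instance

    def __setitem__(self, key, value):
        # Item-style assignment validates the key against the declared fields.
        if key not in self.fields:
            raise KeyError("%s does not support field: %s"
                           % (type(self).__name__, key))
        self._values[key] = value


item = SketchItem()
item["pronunciation"] = "chendadong"  # accepted: declared field
item._values["good_alts"] = []        # bypasses the check entirely

try:
    item["good_alts"] = []            # the validated path rejects the key
    rejected = False
except KeyError:
    rejected = True

# Keys that are stored but were never declared.
undeclared = sorted(set(item._values) - set(item.fields))
print(rejected, undeclared)  # True ['good_alts']
```

Under these assumptions, any code that later looks up each stored key in fields, as the exporter line in the traceback does, fails on the undeclared key.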

1 answer:

Answer 0 (score: 0)

I found the answer!

I had also defined a function:

def good_alts(self):
    return self._values['good_alts']

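Removing this function solved the problem, so there was some kind of name conflict. It is still a very confusing error, but I should have known better than to use the same name for a function and an attribute.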
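
The name conflict can be reproduced without Scrapy. The sketch below assumes that fields are collected from the class body when the class is created; Field and CollectFields are simplified, hypothetical stand-ins, not Scrapy's own classes, and the code uses Python 3 syntax. Because a class body is an ordinary namespace, a later def good_alts rebinds the name, so the Field assigned earlier is gone by the time the fields are collected.

```python
class Field(dict):
    """Hypothetical stand-in for scrapy.Field."""


class CollectFields(type):
    """Simplified stand-in metaclass: gathers Field attributes into `fields`."""

    def __new__(mcs, name, bases, namespace):
        cls = super().__new__(mcs, name, bases, namespace)
        # Only names still bound to a Field at class creation are kept.
        cls.fields = {k: v for k, v in namespace.items()
                      if isinstance(v, Field)}
        return cls


class ShadowedItem(metaclass=CollectFields):
    correct_ans = Field()
    pronunciation = Field()
    good_alts = Field()

    def good_alts(self):  # rebinds the name; the Field above is lost
        return []


class CleanItem(metaclass=CollectFields):
    correct_ans = Field()
    pronunciation = Field()
    good_alts = Field()

    def get_good_alts(self):  # distinct name, no clash
        return []


shadowed_keys = sorted(ShadowedItem.fields)
clean_keys = sorted(CleanItem.fields)
print(shadowed_keys)  # ['correct_ans', 'pronunciation']
print(clean_keys)     # ['correct_ans', 'good_alts', 'pronunciation']
```

Giving the accessor a different name, as get_good_alts does here, keeps all three fields declared.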