I'm new to Python and Scrapy, and I'm trying to write my crawled data to a MySQL database, but I'm getting the following error:
exceptions.AttributeError: 'list' object has no attribute 'encode'
Here is my pipeline code:
import sys
import MySQLdb
import hashlib
from scrapy.exceptions import DropItem
from scrapy.http import Request

class MySQLStorePipeline(object):
    def __init__(self):
        self.conn = MySQLdb.connect(user='User', passwd='passwd', db='db', host='host', charset="utf8", use_unicode=True)
        self.cursor = self.conn.cursor()

    def process_item(self, item, spider):
        try:
            self.cursor.execute("""INSERT INTO Teams (Country, CountryFlagLink, TeamWikiURL, MethodOfQualification, DateOfQualification, FinalsAppearance, LastAppearance, PreviousBestPerformance, FifaRankingAsOfOct2013)
                VALUES (%s, %s)""",
                (item['Country'].encode('utf-8'),
                 item['CountryFlagLink'].encode('utf-8'),
                 item['TeamWikiURL'].encode('utf-8'),
                 item['MethodOfQualification'].encode('utf-8'),
                 item['DateOfQualification'].encode('utf-8'),
                 item['FinalsAppearance'].encode('utf-8'),
                 item['LastAppearance'].encode('utf-8'),
                 item['PreviousBestPerformance'].encode('utf-8'),
                 item['FifaRankingAsOfOct2013'].encode('utf-8')))
            self.conn.commit()
        except MySQLdb.Error, e:
            print "Error %d: %s" % (e.args[0], e.args[1])
        return item
Here is the full stack trace I get after crawling the site and trying to import the data into the MySQL database:
ls\defer.py", line 65, in process_chain
d.callback(input)
File "C:\Python27\lib\site-packages\twisted\internet\defer.py", line
80, in callback
self._startRunCallbacks(result)
File "C:\Python27\lib\site-packages\twisted\internet\defer.py", line
88, in _startRunCallbacks
self._runCallbacks()
--- <exception caught here> ---
File "C:\Python27\lib\site-packages\twisted\internet\defer.py", line
75, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "wikitut\pipelines.py", line 16, in process_item
(item['Country'].encode('utf-8'),
exceptions.AttributeError: 'list' object has no attribute 'encode'
2013-11-12 19:36:33-0600 [wikitut] ERROR: Error processing {'Country': [u'Ecuador'],
'CountryFlagLink': [u'//upload.wikimedia.org/wikipedia/commons/thumb/e/e8/Flag_of_Ecuador.svg/23px-Flag_of_Ecuador.svg.png'],
'DateOfQualification': [u'15 October 2013'],
'FifaRankingAsOfOct2013': [u'22'],
'FinalsAppearance': [u'3rd'],
'LastAppearance': [u'2006'],
'MethodOfQualification': [u'CONMEBOL Round Robin 4th place'],
'PreviousBestPerformance': [u'Round of 16 (2006)'],
'TeamWikiURL': [u'/wiki/Ecuador_national_football_team']}
Traceback (most recent call last):
File "C:\Python27\lib\site-packages\scrapy-0.18.4-py2.7.egg\scrapy\middleware.py", line 62, in _process_chain
return process_chain(self.methods[methodname], obj, *args)
File "C:\Python27\lib\site-packages\scrapy-0.18.4-py2.7.egg\scrapy\utils\defer.py", line 65, in process_chain
d.callback(input)
File "C:\Python27\lib\site-packages\twisted\internet\defer.py", line
80, in callback
self._startRunCallbacks(result)
File "C:\Python27\lib\site-packages\twisted\internet\defer.py", line
88, in _startRunCallbacks
self._runCallbacks()
--- <exception caught here> ---
File "C:\Python27\lib\site-packages\twisted\internet\defer.py", line
75, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "wikitut\pipelines.py", line 16, in process_item
(item['Country'].encode('utf-8'),
exceptions.AttributeError: 'list' object has no attribute 'encode'
2013-11-12 19:36:33-0600 [wikitut] ERROR: Error processing {'Country': [u'Honduras'],
'CountryFlagLink': [u'//upload.wikimedia.org/wikipedia/commons/thumb/8/82/Flag_of_Honduras.svg/23px-Flag_of_Honduras.svg.png'],
'DateOfQualification': [u'15 October 2013'],
'FifaRankingAsOfOct2013': [u'34'],
'FinalsAppearance': [u'3rd'],
'LastAppearance': [u'2010'],
'MethodOfQualification': [u'CONCACAF Fourth Round 3rd place'],
'PreviousBestPerformance': [u'Group stage (1982, 2010)'],
'TeamWikiURL': [u'/wiki/Honduras_national_football_team']}
Traceback (most recent call last):
File "C:\Python27\lib\site-packages\scrapy-0.18.4-py2.7.egg\scrapy\middleware.py", line 62, in _process_chain
return process_chain(self.methods[methodname], obj, *args)
File "C:\Python27\lib\site-packages\scrapy-0.18.4-py2.7.egg\scrapy\utils\defer.py", line 65, in process_chain
d.callback(input)
File "C:\Python27\lib\site-packages\twisted\internet\defer.py", line
80, in callback
self._startRunCallbacks(result)
File "C:\Python27\lib\site-packages\twisted\internet\defer.py", line
88, in _startRunCallbacks
self._runCallbacks()
--- <exception caught here> ---
File "C:\Python27\lib\site-packages\twisted\internet\defer.py", line
75, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "wikitut\pipelines.py", line 16, in process_item
(item['Country'].encode('utf-8'),
exceptions.AttributeError: 'list' object has no attribute 'encode'
2013-11-12 19:36:33-0600 [wikitut] INFO: Closing spider (finished)
2013-11-12 19:36:33-0600 [wikitut] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 246,
'downloader/request_count': 1,
'downloader/request_method_count/GET': 1,
'downloader/response_bytes': 72797,
'downloader/response_count': 1,
'downloader/response_status_count/200': 1,
'finish_reason': 'finished',
'finish_time': datetime.datetime(2013, 11, 13, 1, 36, 33, 840000),
'log_count/DEBUG': 7,
'log_count/ERROR': 22,
'log_count/INFO': 3,
'response_received_count': 1,
'scheduler/dequeued': 1,
'scheduler/dequeued/memory': 1,
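I have a MySQL database set up with all the required fields (all varchar), and its collation is set to utf8_general_ci. I'm lost as to why I'm getting the error above. Can someone explain what I'm doing wrong?
Answer 0 (score: 2)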
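Judging by your error message, item['Country'] is a list containing one element, not a string. See 'Country': [u'Honduras'] in the log output. A list has no encode method, so you need to take the first element before encoding, like this: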
(item['Country'][0].encode('utf-8'),
item['CountryFlagLink'][0].encode('utf-8'),
item['TeamWikiURL'][0].encode('utf-8'),
item['MethodOfQualification'][0].encode('utf-8'),
item['DateOfQualification'][0].encode('utf-8'),
item['FinalsAppearance'][0].encode('utf-8'),
item['LastAppearance'][0].encode('utf-8'),
item['PreviousBestPerformance'][0].encode('utf-8'),
item['FifaRankingAsOfOct2013'][0].encode('utf-8')))
I'm not a Python user, so I may be wrong.
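The fields are lists because Scrapy's selector extract() call returns a list of matches, even when only one node matches. Indexing with [0] works as long as every field matched something, but it raises IndexError on an empty list. Note also that the INSERT in the question names nine columns but has only two %s placeholders, which would likely fail once the encode error is fixed. Below is a minimal sketch of one way to handle both points. The helper names first_value and build_insert are hypothetical (they are not part of Scrapy or MySQLdb), and the column list is copied from the question:

```python
# Column order copied from the INSERT statement in the question.
COLUMNS = [
    "Country", "CountryFlagLink", "TeamWikiURL", "MethodOfQualification",
    "DateOfQualification", "FinalsAppearance", "LastAppearance",
    "PreviousBestPerformance", "FifaRankingAsOfOct2013",
]


def first_value(item, key, default=None):
    """Hypothetical helper: unwrap a list-valued field to its first element.

    Returns `default` when the key is missing or the list is empty.
    """
    value = item.get(key)
    if isinstance(value, (list, tuple)):
        return value[0] if value else default
    return default if value is None else value


def build_insert(item, table="Teams", columns=COLUMNS):
    """Hypothetical helper: build (sql, params) with one %s per column."""
    placeholders = ", ".join(["%s"] * len(columns))
    sql = "INSERT INTO {0} ({1}) VALUES ({2})".format(
        table, ", ".join(columns), placeholders)
    # Values stay parameters; only the trusted column names go into the SQL text.
    params = tuple(first_value(item, column) for column in columns)
    return sql, params
```

Inside process_item this could be used as sql, params = build_insert(item) followed by self.cursor.execute(sql, params). The connection in the question is opened with charset="utf8" and use_unicode=True, so the driver should accept the unicode values directly; if your setup turns out to need byte strings, apply .encode('utf-8') to each non-None value in params instead.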