If I run my code without the last line, getVal(tweet['retweeted_status']['favorite_count']),
the scrape works, but when I add that line I get the error KeyError: 'retweeted_status'.
Does anyone know what I'm doing wrong?
q = "David_Cameron"
results = twitter_user_timeline(twitter_api, q)
print len(results)
# Show one sample search result by slicing the list...
# print json.dumps(results[0], indent=1)
csvfile = open(q + '_timeline.csv', 'w')
csvwriter = csv.writer(csvfile)
csvwriter.writerow(['created_at',
                    'user-screen_name',
                    'text',
                    'coordinates lng',
                    'coordinates lat',
                    'place',
                    'user-location',
                    'user-geo_enabled',
                    'user-lang',
                    'user-time_zone',
                    'user-statuses_count',
                    'user-followers_count',
                    'user-created_at'])
for tweet in results:
    csvwriter.writerow([tweet['created_at'],
                        getVal(tweet['user']['screen_name']),
                        getVal(tweet['text']),
                        getLng(tweet['coordinates']),
                        getLat(tweet['coordinates']),
                        getPlace(tweet['place']),
                        getVal(tweet['user']['location']),
                        getVal(tweet['user']['geo_enabled']),
                        getVal(tweet['user']['lang']),
                        getVal(tweet['user']['time_zone']),
                        getVal(tweet['user']['statuses_count']),
                        getVal(tweet['user']['followers_count']),
                        getVal(tweet['user']['created_at']),
                        getVal(tweet['retweeted_status']['favorite_count']),
                        ])
print "done"
Answer 0 (score: 1)
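According to the API documentation at https://dev.twitter.com/overview/api/tweets, this attribute may or may not be present (it is only included when the tweet is a retweet).
If it is not present, you cannot access it. You can do a safe lookup with the in operator, checking that the key exists before accessing it:
retweeted_favourite_count = tweet['retweeted_status']['favorite_count'] if 'retweeted_status' in tweet else None
Or assume it is there and handle the case where it is not: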
try:
    retweeted_favourite_count = tweet['retweeted_status']['favorite_count']
except KeyError:
    retweeted_favourite_count = 0
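The two approaches above can be compared side by side in a small self-contained sketch. The sample dictionaries and the function names count_with_check and count_with_try are made up for illustration; they only mimic the shape of a retweet and an original tweet and are not real API output.

```python
# Hypothetical sample data: one retweet and one original tweet.
retweet = {"text": "RT @someone: hello",
           "retweeted_status": {"favorite_count": 12}}
original = {"text": "hello"}


def count_with_check(tweet):
    """Test for the key with `in` before reading it."""
    if "retweeted_status" in tweet:
        return tweet["retweeted_status"]["favorite_count"]
    return None


def count_with_try(tweet):
    """Read the key directly and handle the KeyError if it is missing."""
    try:
        return tweet["retweeted_status"]["favorite_count"]
    except KeyError:
        return None


# Both helpers agree on both kinds of tweet.
results = [(count_with_check(t), count_with_try(t)) for t in (retweet, original)]
print(results)
```

Both versions return the count for the retweet and None for the original tweet. Which default to use (None, 0, or an empty string) depends on how the missing value should appear in the CSV.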
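Then pass the retweeted_favourite_count value in the writerow call.
Also, your CSV header row is missing a column name for the retweeted favorite count.
Updated example: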
for tweet in results:
    # Notice this is one long line, not two rows.
    retweeted_favourite_count = tweet['retweeted_status']['favorite_count'] if 'retweeted_status' in tweet else None
    csvwriter.writerow([tweet['created_at'],
                        getVal(tweet['user']['screen_name']),
                        getVal(tweet['text']),
                        getLng(tweet['coordinates']),
                        getLat(tweet['coordinates']),
                        getPlace(tweet['place']),
                        getVal(tweet['user']['location']),
                        getVal(tweet['user']['geo_enabled']),
                        getVal(tweet['user']['lang']),
                        getVal(tweet['user']['time_zone']),
                        getVal(tweet['user']['statuses_count']),
                        getVal(tweet['user']['followers_count']),
                        getVal(tweet['user']['created_at']),
                        # And insert it here instead
                        getVal(retweeted_favourite_count),
                        ])
You could also replace the line:
getVal(tweet['retweeted_status']['favorite_count'])
with the following, as suggested by Padriac Cunningham:
getVal(tweet.get('retweeted_status', {}).get('favorite_count', None))
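As a rough end-to-end sketch, the chained .get lookup and the extra header column can be tried together on made-up data. This assumes Python 3 (the code in this thread is Python 2) and writes to an in-memory buffer instead of a file; the sample tweets and the column name retweeted_status-favorite_count are illustrative assumptions, not something the Twitter API prescribes.

```python
import csv
import io

# Hypothetical sample data shaped like timeline results.
tweets = [
    {"created_at": "Mon Jan 05 10:00:00 +0000 2015",
     "text": "RT @someone: hello",
     "retweeted_status": {"favorite_count": 12}},
    {"created_at": "Mon Jan 05 11:00:00 +0000 2015",
     "text": "hello"},
]

buffer = io.StringIO()  # stands in for the CSV file
writer = csv.writer(buffer)

# The header has one name per value written below, including the new column.
writer.writerow(["created_at", "text", "retweeted_status-favorite_count"])

for tweet in tweets:
    # Falls back to an empty dict, then to None, if either key is missing.
    count = tweet.get("retweeted_status", {}).get("favorite_count")
    writer.writerow([tweet["created_at"], tweet["text"], count])

csv_text = buffer.getvalue()
print(csv_text)
```

csv.writer writes None as an empty field, so tweets that are not retweets simply get a blank value in the last column.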
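Answer 1 (score: 0)
FYI, for anyone who sees this in the future: I managed to get the code working with the following. getVal(tweet['favorite_count']) gives the number of favorites for the tweet.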
q = "SkyNews"
results = twitter_user_timeline(twitter_api, q)
csvfile = open(q + '_timeline.csv', 'w')
csvwriter = csv.writer(csvfile)
csvwriter.writerow(['created_at',
                    'user-screen_name',
                    'text',
                    'language',
                    'coordinates lng',
                    'coordinates lat',
                    'place',
                    'user-location',
                    'user-geo_enabled',
                    'user-lang',
                    'user-time_zone',
                    'user-statuses_count',
                    'user-followers_count',
                    'user-friend_count',
                    'user-created_at',
                    'favorite_count',
                    'retweet_count',
                    'user-mentions',
                    'urls',
                    'hashtags',
                    'symbols'])
for tweet in results:
    csvwriter.writerow([tweet['created_at'],
                        getVal(tweet['user']['screen_name']),
                        getVal(tweet['text']),
                        getVal(tweet['lang']),
                        getLng(tweet['coordinates']),
                        getLat(tweet['coordinates']),
                        getPlace(tweet['place']),
                        getVal(tweet['user']['location']),
                        getVal(tweet['user']['geo_enabled']),
                        getVal(tweet['user']['lang']),
                        getVal(tweet['user']['time_zone']),
                        getVal(tweet['user']['statuses_count']),
                        getVal(tweet['user']['followers_count']),
                        getVal(tweet['user']['friends_count']),
                        getVal(tweet['user']['created_at']),
                        getVal(tweet['favorite_count']),
                        getVal(tweet['retweet_count']),
                        tweet['entities']['user_mentions'],
                        tweet['entities']['urls'],
                        tweet['entities']['hashtags'],
                        tweet['entities']['symbols'],
                        ])
print "done"
where getVal, getLng, getLat and getPlace are defined earlier in the code as:
def getVal(val):
    clean = ""
    if isinstance(val, bool):
        return val
    if isinstance(val, int):
        return val
    if val:
        clean = val.encode('utf-8')
    return clean

def getLng(val):
    if isinstance(val, dict):
        return val['coordinates'][0]

def getLat(val):
    if isinstance(val, dict):
        return val['coordinates'][1]

def getPlace(val):
    if isinstance(val, dict):
        return val['full_name'].encode('utf-8')