I wrote the following code to fetch tweets, reading the keywords with 'utf-8' encoding:
kws=[]
f=codecs.open("keywords", encoding='utf-8')
kws = f.readlines()
f.close()
print kws
for kw in kws:
    timeline_endpoint ='https://api.twitter.com/1.1/search/tweets.json?q='+kw+'&count=100&lang=fr'
    print timeline_endpoint
    response, data = client.request(timeline_endpoint)
    tweets = json.loads(data)
    for tweet in tweets['statuses']:
        my_on_data(json.dumps(tweet.encode('utf-8')))
    time.sleep(3)
But I get the following error:
    response, data = client.request(timeline_endpoint)
  File "build/bdist.linux-x86_64/egg/oauth2/__init__.py", line 676, in request
  File "build/bdist.linux-x86_64/egg/oauth2/__init__.py", line 440, in to_url
  File "/usr/lib/python2.7/urllib.py", line 1357, in urlencode
    l.append(k + '=' + quote_plus(str(elt)))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 1: ordinal not in range(128)
Any help would be appreciated.
Answer 0 (score: 0)
OK, here is a solution that uses a different search method:
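import json
import time

import tweepy

# kws and my_on_data are the ones from the question
auth = tweepy.OAuthHandler("k1", "k2")
auth.set_access_token("k3", "k4")
api = tweepy.API(auth)
for kw in kws:
    max_tweets = 10
    # Encode the unicode keyword to UTF-8 bytes before passing it to tweepy
    searched_tweets = [status for status in tweepy.Cursor(api.search, q=kw.encode('utf-8')).items(max_tweets)]
    for tweet in searched_tweets:
        my_on_data(json.dumps(tweet._json))
    time.sleep(3)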
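If you would rather keep the raw request from the question, another option may be to percent-encode each keyword as UTF-8 before it goes into the URL. The traceback shows urlencode calling str() on a unicode value containing u'\xe9', which Python 2 tries to encode as ASCII. The sketch below is only an illustration: build_search_url is a hypothetical helper name, it assumes the keywords are unicode strings as returned by codecs.open, and it is untested against oauth2 itself, so treat it as a starting point.

```python
try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote  # Python 2

SEARCH_URL = 'https://api.twitter.com/1.1/search/tweets.json'


def build_search_url(keyword, count=100, lang='fr'):
    """Return a search URL whose query term is percent-encoded UTF-8.

    Hypothetical helper; count and lang default to the values used
    in the question.
    """
    # readlines() keeps the trailing newline, so strip it first,
    # then encode the unicode keyword to UTF-8 bytes.
    raw = keyword.strip().encode('utf-8')
    # safe='' makes quote() escape every reserved character as well.
    encoded = quote(raw, safe='')
    return '%s?q=%s&count=%d&lang=%s' % (SEARCH_URL, encoded, count, lang)
```

Inside the loop you would then call client.request(build_search_url(kw)) instead of concatenating kw into the URL, so the string handed to oauth2 contains only ASCII characters.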