Python encoding problem when searching tweets

Date: 2016-08-16 16:24:04

Tags: python encoding urllib tweepy

I wrote the following code to fetch tweets, reading the search keywords from a 'utf-8' encoded file:

kws=[]        
f=codecs.open("keywords", encoding='utf-8')
kws = f.readlines()
f.close()
print kws

for kw in kws:
    timeline_endpoint ='https://api.twitter.com/1.1/search/tweets.json?q='+kw+'&count=100&lang=fr'
    print timeline_endpoint
    response, data = client.request(timeline_endpoint)
    tweets = json.loads(data)
    for tweet in tweets['statuses']:
        my_on_data(json.dumps(tweet.encode('utf-8')))
    time.sleep(3)

But I get the following error:

response, data = client.request(timeline_endpoint)
File "build/bdist.linux-x86_64/egg/oauth2/__init__.py", line 676, in request
File "build/bdist.linux-x86_64/egg/oauth2/__init__.py", line 440, in to_url
File "/usr/lib/python2.7/urllib.py", line 1357, in urlencode
    l.append(k + '=' + quote_plus(str(elt)))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 1: ordinal not in range(128)

Any help would be appreciated.

1 answer:

Answer 0 (score: 0)

OK, here is a solution that uses a different search method (tweepy instead of building the request URL by hand):

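import json
import time

import tweepy

# "k1".."k4" are placeholders for the consumer key/secret and access token/secret.
auth = tweepy.OAuthHandler("k1", "k2")
auth.set_access_token("k3", "k4")
api = tweepy.API(auth)

max_tweets = 10
for kw in kws:
    # Strip the newline left by readlines() and pass the query as UTF-8 bytes.
    query = kw.strip().encode('utf-8')
    for tweet in tweepy.Cursor(api.search, q=query).items(max_tweets):
        my_on_data(json.dumps(tweet._json))
    time.sleep(3)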
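
For context, the traceback in the question ends in urllib's urlencode calling str() on a unicode value, which in Python 2 implicitly uses the ASCII codec and fails on u'\xe9'. If you would rather keep the hand-built URL, one possible approach is to percent-encode the keyword as UTF-8 before concatenating it. The following is only a sketch: build_search_url and SEARCH_URL are hypothetical names, not part of the original code, and I have not checked how the oauth2 client handles the resulting URL.

```python
# -*- coding: utf-8 -*-
try:
    from urllib import quote_plus        # Python 2
except ImportError:
    from urllib.parse import quote_plus  # Python 3

# Endpoint taken from the question.
SEARCH_URL = 'https://api.twitter.com/1.1/search/tweets.json'


def build_search_url(keyword, count=100, lang='fr'):
    """Return the search URL with the keyword percent-encoded as UTF-8.

    Hypothetical helper for illustration; `keyword` is expected to be a
    unicode string such as a line read via codecs.open(..., encoding='utf-8').
    """
    # Drop the trailing newline left by readlines(), then encode to UTF-8
    # bytes so that no implicit ASCII conversion is ever attempted.
    encoded = quote_plus(keyword.strip().encode('utf-8'))
    return '%s?q=%s&count=%d&lang=%s' % (SEARCH_URL, encoded, count, lang)


print(build_search_url(u'caf\xe9\n'))
```

The loop from the question would then call client.request(build_search_url(kw)) instead of concatenating kw directly.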