Searching for Chinese text raises UnicodeEncodeError

Time: 2013-04-28 22:18:39

Tags: python unicode python-2.7 python-twitter

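I am using python-twitter to search for tweets through Twitter's API, and I am running into a problem with Chinese search terms. Here is a minimal code example that reproduces the problem: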

# -*- coding: utf-8 -*-
import twitter

api = twitter.Api(consumer_key = "...", consumer_secret = "...",
                  access_token_key = "...", access_token_secret = "...")

api.VerifyCredentials()
print u"您说英语吗"
r = api.GetSearch(term=u"您说英语吗")

I get this error:

您说英语吗
Traceback (most recent call last):
  File "so.py", line 9, in <module>
    r = api.GetSearch(term=u"您说英语吗")
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/python_twitter-0.8.7-py2.7.egg/twitter.py", line 2419, in GetSearch
    json = self._FetchUrl(url, parameters=parameters)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/python_twitter-0.8.7-py2.7.egg/twitter.py", line 4041, in _FetchUrl
    url = req.to_url()
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/oauth2-1.5.211-py2.7.egg/oauth2/__init__.py", line 440, in to_url
    urllib.urlencode(query, True), fragment)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 1337, in urlencode
    l.append(k + '=' + quote_plus(str(elt)))
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-4: ordinal not in range(128)

2 answers:

Answer 0: (score: 2)

There appears to be a bug in GetSearch: https://code.google.com/p/python-twitter/issues/detail?id=210. I tried searching for "Путин" ("Putin" in Russian) and got the same error. Encoding the search term did not help.
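
The traceback in the question ends at `quote_plus(str(elt))` inside `urllib.urlencode`. On Python 2, calling `str()` on a `unicode` object encodes it with the default ASCII codec, which cannot represent Chinese or Cyrillic characters. The sketch below reproduces that failure in isolation by calling `.encode("ascii")` explicitly (an assumption made so the snippet behaves the same on Python 2 and 3) and shows that UTF-8 handles the same string. It does not touch python-twitter itself.

```python
# -*- coding: utf-8 -*-
# Minimal sketch of the failure mode, independent of python-twitter.
term = u"您说英语吗"

# On Python 2, str(term) implicitly does the equivalent of term.encode("ascii").
# Doing it explicitly here makes the behaviour identical on Python 2 and 3.
try:
    term.encode("ascii")
    error_span = None
except UnicodeEncodeError as exc:
    # exc.start / exc.end delimit the run of characters that could not be encoded
    error_span = (exc.start, exc.end)

# UTF-8 can represent every character; each of these takes three bytes.
utf8_bytes = term.encode("utf-8")

print(error_span)
print(len(utf8_bytes))
```

The reported span covers all five characters, which matches the "position 0-4" in the error message above.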

As a workaround, you can use the twitter package (https://github.com/sixohsix/twitter):

# -*- coding: utf-8 -*-
from twitter import *

t = Twitter(auth=OAuth(token="...", token_secret="...", consumer_key="...", consumer_secret="..."))

print t.search.tweets(q=u"您说英语吗")

Answer 1: (score: 0)

Alternatively, try adding the following code before working with non-English text:

import sys
reload(sys)
sys.setdefaultencoding("utf-8")
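
Changing the default encoding affects every implicit string conversion in the process, so a narrower option may be to encode the values explicitly before they are URL-encoded. The following is only a sketch under that assumption: `encode_params` is a hypothetical helper, not part of python-twitter or oauth2, and the first answer reports that encoding the term did not help with python-twitter 0.8.7, so this illustrates the general technique rather than a confirmed fix for that library.

```python
# -*- coding: utf-8 -*-
# Sketch: encode text values to UTF-8 before URL-encoding them,
# instead of changing the interpreter-wide default encoding.
try:
    from urllib import urlencode          # Python 2
except ImportError:
    from urllib.parse import urlencode    # Python 3

try:
    text_type = unicode                   # Python 2
except NameError:
    text_type = str                       # Python 3


def encode_params(params):
    """Hypothetical helper: copy params, turning text values into UTF-8 bytes."""
    encoded = {}
    for key, value in params.items():
        if isinstance(value, text_type):
            value = value.encode("utf-8")
        encoded[key] = value
    return encoded


# urlencode percent-encodes the UTF-8 bytes without any implicit ASCII step.
query = urlencode(encode_params({"q": u"您说英语吗"}))
print(query)
```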