requests.get raises UnicodeDecodeError in Python 3?

Asked: 2017-06-20 18:22:19

Tags: python python-3.x python-requests

I'm having a problem using the requests library in Python 3.

Here is my code:

url = 'https://www.contrataciones.gov.py/images/opendata/planificaciones/2016.csv'
r = requests.get(url)
reader = csv.DictReader(r.content.splitlines())

When I run the script with Python 2 it works perfectly, but with Python 3 it fails on the requests.get(url) line:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 359: ordinal not in range(128)

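What am I doing wrong? I know how to decode the content and so on, but the error is raised directly by the request itself.

Update: full traceback. It looks like this might be related to pickle?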

  File "fetch.py", line 128, in <module>
    main()
  File "fetch.py", line 115, in main
    id_list = fetchList(options.year)
  File "/usr/local/lib/python3.6/site-packages/ratelimit/__init__.py", line 21, in func_wrapper
    ret = func(*args, **kargs)
  File "fetch.py", line 89, in fetchList
    r = requests.get(url)
  File "/usr/local/lib/python3.6/site-packages/requests/api.py", line 70, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/requests/api.py", line 56, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/requests_cache/core.py", line 126, in request
    **kwargs
  File "/usr/local/lib/python3.6/site-packages/requests/sessions.py", line 488, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.6/site-packages/requests_cache/core.py", line 97, in send
    response, timestamp = self.cache.get_response_and_time(cache_key)
  File "/usr/local/lib/python3.6/site-packages/requests_cache/backends/base.py", line 70, in get_response_and_time
    if key not in self.responses:
  File "/usr/local/Cellar/python3/3.6.0_1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/_collections_abc.py", line 666, in __contains__
    self[key]
  File "/usr/local/lib/python3.6/site-packages/requests_cache/backends/storage/dbdict.py", line 163, in __getitem__
    return pickle.loads(bytes(super(DbPickleDict, self).__getitem__(key)))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 359: ordinal not in range(128)
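
The last frame of the traceback is the pickle.loads call inside requests_cache, not the HTTP request itself. One way this exact error can arise is when Python 3 unpickles a byte string that was pickled by Python 2, because pickle.loads decodes such strings as ASCII by default. The sketch below reproduces that under a stated assumption: the post does not confirm that the cache was written by a Python 2 run, and PY2_PICKLE and load_py2_pickle are hypothetical names used only for illustration.

```python
import pickle

# Hand-written protocol-2 pickle of the Python 2 byte string 'caf\xc3\xa9'
# (UTF-8 bytes for "cafe" with an accented e). This is a hypothetical
# stand-in for a cached response, not data from the original script.
PY2_PICKLE = b"\x80\x02U\x05caf\xc3\xa9q\x00."


def load_py2_pickle(data):
    """Unpickle with the default ASCII decoding, falling back to raw bytes."""
    try:
        return pickle.loads(data)
    except UnicodeDecodeError:
        # encoding="bytes" keeps Python 2 str objects as bytes
        return pickle.loads(data, encoding="bytes")


# The default call fails the same way the traceback does.
try:
    pickle.loads(PY2_PICKLE)
    default_error = None
except UnicodeDecodeError as exc:
    default_error = exc

recovered = load_py2_pickle(PY2_PICKLE)
print(default_error)
print(repr(recovered))
```

If that assumption holds, deleting the stale cache file so that it is rebuilt under Python 3 would probably be simpler than changing how it is unpickled, since the pickle.loads call lives inside requests_cache rather than in the script.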

2 Answers:

Answer 0: (score: 0)

In Python 2, the plain str type is a byte string, implicitly treated as ASCII. In Python 3.x, the plain str type is Unicode text.

print(type('default string '))
print(type(b'string with b '))

'''
Output in Python 2.x (Bytes is same as str)
<type 'str'>
<type 'str'>

Output in Python 3.x (Bytes and str are different)
<class 'str'>
<class 'bytes'>
'''

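The error is at line 89: r = requests.get(url). In Python 3.x, bytes and str are different types, so Python 3.x converts bytes using the default encoding, UTF-8, and you need to decode them back to the original value.

This may solve your problem. Try adding these lines before line 89: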

try:
    # Only bytes objects have .decode(); a Python 3 str raises AttributeError
    url = url.decode("utf-8")
except AttributeError:
    pass

Here is the explanation: decode only works on encoded strings, that is, on bytes objects.

>>> s = "x,jnvjlf"
>>> d = s.encode("utf-8")
>>> d
b'x,jnvjlf'
>>> e = s.decode("utf-8")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'str' object has no attribute 'decode'
>>> e = d.decode("utf-8")
>>> e
'x,jnvjlf'
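
The session above shows that only bytes objects have a decode method in Python 3. A minimal sketch of a guard built on that observation follows; ensure_text is a hypothetical helper name and the example.com URL is a placeholder, neither of which comes from the original post.

```python
def ensure_text(value, encoding="utf-8"):
    """Return value as str, decoding only when it is a bytes object."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Both calls produce the same str, whichever type was passed in.
from_bytes = ensure_text(b"https://example.com/data.csv")
from_str = ensure_text("https://example.com/data.csv")
print(from_bytes == from_str)
```

Checking the type with isinstance avoids relying on an exception to tell bytes and str apart.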

Answer 1: (score: -1)

You have to decode your URL before making the request.

Try this:

# Works only if url is a bytes object; a Python 3 str has no .decode()
newurl = url.decode("utf-8")
r = requests.get(newurl)