I'm having a problem with the requests library in Python 3.
Here is my code:
import csv
import requests

url = 'https://www.contrataciones.gov.py/images/opendata/planificaciones/2016.csv'
r = requests.get(url)
reader = csv.DictReader(r.content.splitlines())
The script works perfectly when I run it with Python 2, but with Python 3 it fails on the requests.get(url) line:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 359: ordinal not in range(128)
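What am I doing wrong? I know how to decode the content and so on, but the error is raised directly by the request itself.
Update: full traceback. It looks like this might be related to pickle?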
File "fetch.py", line 128, in <module>
main()
File "fetch.py", line 115, in main
id_list = fetchList(options.year)
File "/usr/local/lib/python3.6/site-packages/ratelimit/__init__.py", line 21, in func_wrapper
ret = func(*args, **kargs)
File "fetch.py", line 89, in fetchList
r = requests.get(url)
File "/usr/local/lib/python3.6/site-packages/requests/api.py", line 70, in get
return request('get', url, params=params, **kwargs)
File "/usr/local/lib/python3.6/site-packages/requests/api.py", line 56, in request
return session.request(method=method, url=url, **kwargs)
File "/usr/local/lib/python3.6/site-packages/requests_cache/core.py", line 126, in request
**kwargs
File "/usr/local/lib/python3.6/site-packages/requests/sessions.py", line 488, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python3.6/site-packages/requests_cache/core.py", line 97, in send
response, timestamp = self.cache.get_response_and_time(cache_key)
File "/usr/local/lib/python3.6/site-packages/requests_cache/backends/base.py", line 70, in get_response_and_time
if key not in self.responses:
File "/usr/local/Cellar/python3/3.6.0_1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/_collections_abc.py", line 666, in __contains__
self[key]
File "/usr/local/lib/python3.6/site-packages/requests_cache/backends/storage/dbdict.py", line 163, in __getitem__
return pickle.loads(bytes(super(DbPickleDict, self).__getitem__(key)))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 359: ordinal not in range(128)
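The last frame of the traceback is a pickle.loads call inside requests_cache, not the HTTP request. One possible reading, which the traceback alone does not confirm, is that the cache entry was pickled under Python 2: in Python 3, pickle.loads decodes Python 2 byte strings with encoding="ASCII" by default. The sketch below reproduces that behaviour with a hand-built pickle; the payload and the helper name load_default are made up for illustration.

```python
import pickle

# Hand-built protocol-2 pickle of a Python 2 byte string:
# PROTO 2, SHORT_BINSTRING ('U' + 1-byte length + data), STOP.
raw = "Asunción".encode("utf-8")  # contains the byte 0xc3
py2_pickle = b"\x80\x02" + b"U" + bytes([len(raw)]) + raw + b"."


def load_default(data):
    """Return the error message from a default pickle.loads, or None."""
    try:
        pickle.loads(data)  # encoding defaults to "ASCII"
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


print(load_default(py2_pickle))
# Passing an explicit encoding avoids the ASCII decode step.
print(pickle.loads(py2_pickle, encoding="bytes"))
print(pickle.loads(py2_pickle, encoding="utf-8"))
```

If that is what is happening here, removing the cache file that was created under Python 2, so that Python 3 builds a fresh one, would be one thing to try.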
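Answer 0 (score: 0)
In Python 2, the plain str type is a byte string, and implicit conversions use the ASCII codec. In Python 3.x, str is Unicode text and bytes is a separate type.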
print(type('default string '))
print(type(b'string with b '))
'''
Output in Python 2.x (Bytes is same as str)
<type 'str'>
<type 'str'>
Output in Python 3.x (Bytes and str are different)
<class 'str'>
<class 'bytes'>
'''
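The error is on line 89: r = requests.get(url)
In Python 3.x, bytes and str are different types. Python 3.x does not convert between them implicitly, so bytes have to be decoded explicitly (decode() uses UTF-8 by default) to get text back.
This may solve your problem; try adding these lines before line 89: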
try:
    # Only bytes have .decode(); a Python 3 str raises AttributeError.
    url = url.decode("utf-8")
except (AttributeError, UnicodeEncodeError):
    pass
Here is the explanation: decode() only works on encoded (bytes) strings.
>>> s = "x,jnvjlf"
>>> d = s.encode("utf-8")
>>> d
b'x,jnvjlf'
>>> e = s.decode("utf-8")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'str' object has no attribute 'decode'
>>> e = d.decode("utf-8")
>>> e
'x,jnvjlf'
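The same rule applies to the CSV parsing in the question: in Python 3, csv.DictReader expects lines of text, so bytes would need to be decoded before being split. A minimal sketch, using a hard-coded bytes value as a stand-in for r.content (the sample data and the helper name rows_from_bytes are made up):

```python
import csv

# Stand-in for r.content: UTF-8 encoded bytes (made-up sample, not the real file).
content = "id,nombre\n1,Asunción\n".encode("utf-8")


def rows_from_bytes(data, encoding="utf-8"):
    """Decode bytes to str, then parse the lines as CSV with a header row."""
    return list(csv.DictReader(data.decode(encoding).splitlines()))


rows = rows_from_bytes(content)
print(rows[0]["nombre"])
```

With requests, r.text already returns the body decoded to str, so r.text.splitlines() should serve the same purpose when the detected encoding is correct.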
Answer 1 (score: -1)
You have to decode your URL before making the request.
Try this:
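# decode() exists only on bytes; a str URL is passed through unchanged
newurl = url.decode("utf-8") if isinstance(url, bytes) else url
r = requests.get(newurl)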