I want to save an image from reddit to a local file:
import urllib.request
urllib.request.urlretrieve(post.url, "local-filename.jpg")
This worked until I ran into a reddit post whose URL has an "ä" in the title part.
To reproduce, replace post.url with any URL that contains "ä".
I know this is related to a basic Python unicode issue, and I have read "UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 20: ordinal not in range(128)", but I can't work out how to apply that fix to my problem.
This is the error I get (the first line is the post URL and title printed by my script):
https://www.reddit.com/r/OkBrudiMongo/comments/cogkye/hä_was_ist_los_petersans/ Hä, was ist los Peter-Sans?!
Traceback (most recent call last):
File "C:/Users/Administrator/Documents/DefaultProject/Main.py", line 55, in <module>
Mainbot()
File "C:/Users/Administrator/Documents/DefaultProject/Main.py", line 48, in Mainbot
Mainbot()
File "C:/Users/Administrator/Documents/DefaultProject/Main.py", line 38, in Mainbot
urllib.request.urlretrieve(post.url, "local-filename.jpg")
File "C:\Program Files\Python37\lib\urllib\request.py", line 247, in urlretrieve
with contextlib.closing(urlopen(url, data)) as fp:
File "C:\Program Files\Python37\lib\urllib\request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "C:\Program Files\Python37\lib\urllib\request.py", line 525, in open
response = self._open(req, data)
File "C:\Program Files\Python37\lib\urllib\request.py", line 543, in _open
'_open', req)
File "C:\Program Files\Python37\lib\urllib\request.py", line 503, in _call_chain
result = func(*args)
File "C:\Program Files\Python37\lib\urllib\request.py", line 1360, in https_open
context=self._context, check_hostname=self._check_hostname)
File "C:\Program Files\Python37\lib\urllib\request.py", line 1317, in do_open
encode_chunked=req.has_header('Transfer-encoding'))
File "C:\Program Files\Python37\lib\http\client.py", line 1229, in request
self._send_request(method, url, body, headers, encode_chunked)
File "C:\Program Files\Python37\lib\http\client.py", line 1240, in _send_request
self.putrequest(method, url, **skips)
File "C:\Program Files\Python37\lib\http\client.py", line 1107, in putrequest
self._output(request.encode('ascii'))
UnicodeEncodeError: 'ascii' codec can't encode character '\xe4' in position 37: ordinal not in range(128)
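
The last frames of the traceback show http.client encoding the request line with request.encode('ascii'), which fails on the "ä" in the URL path. One possible workaround, as a minimal sketch, is to percent-encode the non-ASCII characters before passing the URL to urlretrieve. The helper name to_ascii_url is hypothetical (it is not part of urllib), and the sketch assumes the host name itself is plain ASCII and that the URL should be encoded as UTF-8:

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_ascii_url(url):
    """Percent-encode non-ASCII characters in the path and query of a URL.

    Hypothetical helper for illustration. Assumes the host name is
    already ASCII; internationalized domain names are not handled.
    """
    parts = urlsplit(url)
    # Keep "/" as a separator and "%" so existing escapes are not encoded twice.
    path = quote(parts.path, safe="/%")
    # Keep "=" and "&" so query parameters stay intact.
    query = quote(parts.query, safe="=&%")
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


# Intended use with the code from the question (not run here):
# urllib.request.urlretrieve(to_ascii_url(post.url), "local-filename.jpg")
```

quote() encodes as UTF-8 by default, so "ä" becomes %C3%A4 and the resulting request line contains only ASCII characters.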