Why does urllib.request.urlretrieve fail when the URL contains a special character ("ä")?

Time: 2019-10-07 06:08:39

Tags: python url utf-8 request urllib

I want to save an image from reddit to a local file:

import urllib.request
urllib.request.urlretrieve(post.url, "local-filename.jpg")

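This worked until one day I ran into an error: the URL of a reddit post had an "ä" in the title part.

To reproduce, just replace post.url with a URL that contains "ä".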

I know this is related to a basic Python unicode issue, and I have read "UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 20: ordinal not in range(128)", but I can't figure out how to apply that fix to my problem.

This is the error I get:

https://www.reddit.com/r/OkBrudiMongo/comments/cogkye/hä_was_ist_los_petersans/ Hä, was ist los Peter-Sans?!
Traceback (most recent call last):
  File "C:/Users/Administrator/Documents/DefaultProject/Main.py", line 55, in <module>
    Mainbot()
  File "C:/Users/Administrator/Documents/DefaultProject/Main.py", line 48, in Mainbot
    Mainbot()
  File "C:/Users/Administrator/Documents/DefaultProject/Main.py", line 38, in Mainbot
    urllib.request.urlretrieve(post.url, "local-filename.jpg")
  File "C:\Program Files\Python37\lib\urllib\request.py", line 247, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "C:\Program Files\Python37\lib\urllib\request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Program Files\Python37\lib\urllib\request.py", line 525, in open
    response = self._open(req, data)
  File "C:\Program Files\Python37\lib\urllib\request.py", line 543, in _open
    '_open', req)
  File "C:\Program Files\Python37\lib\urllib\request.py", line 503, in _call_chain
    result = func(*args)
  File "C:\Program Files\Python37\lib\urllib\request.py", line 1360, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "C:\Program Files\Python37\lib\urllib\request.py", line 1317, in do_open
    encode_chunked=req.has_header('Transfer-encoding'))
  File "C:\Program Files\Python37\lib\http\client.py", line 1229, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "C:\Program Files\Python37\lib\http\client.py", line 1240, in _send_request
    self.putrequest(method, url, **skips)
  File "C:\Program Files\Python37\lib\http\client.py", line 1107, in putrequest
    self._output(request.encode('ascii'))
UnicodeEncodeError: 'ascii' codec can't encode character '\xe4' in position 37: ordinal not in range(128)
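
The last frame of the traceback shows http.client calling request.encode('ascii') on the HTTP request line, which is where the "ä" in the URL path fails. One possible workaround is to percent-encode the non-ASCII characters before passing the URL to urlretrieve. The following is only a sketch: to_ascii_url is a hypothetical helper name, and it assumes the host name is already ASCII and that the server expects UTF-8 percent-encoding.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_ascii_url(url):
    """Percent-encode non-ASCII characters in path, query and fragment.

    Hypothetical helper, not part of urllib. Assumes the host name is
    already ASCII and that UTF-8 percent-encoding is what the server wants.
    """
    parts = urlsplit(url)
    # "%" is kept safe so already-encoded sequences are not encoded twice.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="/%=&+")
    fragment = quote(parts.fragment, safe="%")
    return urlunsplit((parts.scheme, parts.netloc, path, query, fragment))


url = "https://www.reddit.com/r/OkBrudiMongo/comments/cogkye/hä_was_ist_los_petersans/"
safe_url = to_ascii_url(url)
print(safe_url)

# The encoded URL can then be passed on, for example:
# urllib.request.urlretrieve(safe_url, "local-filename.jpg")
```

With this approach the call in the question would become urllib.request.urlretrieve(to_ascii_url(post.url), "local-filename.jpg"). URLs that are already pure ASCII pass through unchanged.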

0 answers:

No answers yet