Question

I have a URL like this:

http://idebate.org/debatabase/debates/constitutional-governance/house-supports-dalai-lama%E2%80%99s-%E2%80%98third-way%E2%80%99-tibet

I then use the following line in Python to decode the URL:

full_href = urllib.unquote(full_href.encode('ascii')).decode('utf-8')

However, I get this error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 89: ordinal not in range(128)

It occurs when I try to write the result to a file.

Answer 1

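As @KevinJ.Chase pointed out, you are most likely trying to write a unicode string containing non-ASCII characters to a file, which makes Python 2 fall back to the incompatible ascii codec. You can either change the encoding used when writing the file, or keep full_href as an already-encoded byte string, as follows: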

import urllib

# Skip the final .decode('utf-8') so full_href stays a UTF-8 byte string (Python 2)
full_href = urllib.unquote(full_href.encode('ascii'))
# ... then write full_href to your file stream

or

...
# encode your string to a compatible encoding on write, e.g. utf-8
with open('yourfilenamehere', 'w') as f:
    f.write(full_href.encode('utf-8'))
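For anyone on Python 3, here is a minimal sketch of the same idea. It is not part of the original answer, and it assumes Python 3, where `urllib.parse.unquote` replaces `urllib.unquote` and decodes percent-escapes as UTF-8 by default. The output path `decoded_href.txt` in the temp directory is a made-up name for illustration.

```python
import os
import tempfile
from urllib.parse import unquote

url = ("http://idebate.org/debatabase/debates/constitutional-governance/"
       "house-supports-dalai-lama%E2%80%99s-%E2%80%98third-way%E2%80%99-tibet")

# unquote() returns a text string; %E2%80%99 becomes U+2019 (right single quote).
full_href = unquote(url)

# Encoding that text as ASCII fails, which is what the traceback in the question reports.
try:
    full_href.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Hypothetical output path, chosen only for this example.
out_path = os.path.join(tempfile.gettempdir(), "decoded_href.txt")

# Naming the encoding explicitly avoids relying on the platform default.
with open(out_path, "w", encoding="utf-8") as f:
    f.write(full_href)
```

Passing `encoding="utf-8"` to `open` does the same job as the explicit `.encode('utf-8')` call above: the text is converted to bytes with a codec that can represent the curly quotes.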