Decoding the following URL in Python

Time: 2014-11-06 00:59:47

Tags: python decode encode

I have a URL like this:

http://idebate.org/debatabase/debates/constitutional-governance/house-supports-dalai-lama%E2%80%99s-%E2%80%98third-way%E2%80%99-tibet

I then use the following line in Python to decode this URL:

full_href = urllib.unquote(full_href.encode('ascii')).decode('utf-8')

However, I get an error like this:

UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 89: ordinal not in range(128)

It happens when I try to write the result to a file.

1 answer:

Answer 0 (score: 0)

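As @KevinJ.Chase pointed out, you are most likely writing a string that contains non-ASCII characters to a file through the ascii codec, which cannot represent them. You can either change the encoding used when writing the file, or skip the final decode so that full_href stays an encoded byte string, like this: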

import urllib

# Python 2: skip the final .decode('utf-8') so full_href stays a UTF-8 byte string
full_href = urllib.unquote(url.encode('ascii'))
# ... then write full_href to your file stream

or:

...
# encode your string to a compatible encoding on write, e.g. UTF-8
with open('yourfilenamehere', 'w') as f:
    f.write(full_href.encode('utf-8'))
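
For anyone on Python 3, here is a minimal sketch of the same idea. It assumes Python 3, where unquote lives in urllib.parse and returns text already decoded as UTF-8. The output path is a hypothetical temporary file chosen only for illustration. The try/except shows why the ascii codec fails on this URL, and the explicit encoding on open() is what avoids the error when writing.

```python
import os
import tempfile
from urllib.parse import unquote

url = ("http://idebate.org/debatabase/debates/constitutional-governance/"
       "house-supports-dalai-lama%E2%80%99s-%E2%80%98third-way%E2%80%99-tibet")

# unquote() decodes percent-escapes as UTF-8 by default and returns str.
full_href = unquote(url)

# The decoded text contains U+2019 and U+2018, which ASCII cannot represent,
# so encoding it with the ascii codec raises UnicodeEncodeError.
try:
    full_href.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Hypothetical output location, used here only so the sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), "decoded_url.txt")

# Passing encoding="utf-8" avoids depending on the platform default codec.
with open(path, "w", encoding="utf-8") as f:
    f.write(full_href)

# Read the file back to confirm the text survived the round trip.
with open(path, encoding="utf-8") as f:
    round_trip = f.read()
```

Here ascii_ok ends up False, which mirrors the error in the question, while the UTF-8 write and read-back succeed.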