Python: decoding text to ASCII

Asked: 2011-09-23 14:26:57

Tags: python unicode decode

How do I decode a unicode string like this:

  

what%2527s%2bthe%2btime%252C%2bnow%253F

into ASCII like this:

  

what's+the+time+now

3 answers:

Answer 0: (score: 6)

In your case the string was percent-encoded (quoted) twice, so it has to be unquoted twice to get the original text back:

In [1]: import urllib
In [2]: urllib.unquote(urllib.unquote("what%2527s%2bthe%2btime%252c%2bnow%253f") )
Out[2]: "what's+the+time,+now?"
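
The session above is Python 2. As a rough sketch of the same idea on Python 3, where the function lives at urllib.parse.unquote, the snippet below keeps unquoting until the text stops changing instead of hard-coding two passes. The helper name unquote_fully and the max_rounds limit are made up for this illustration and are not part of the standard library.

```python
from urllib.parse import unquote


def unquote_fully(text, max_rounds=5):
    """Percent-decode repeatedly until the text stops changing.

    Hypothetical helper: max_rounds is an arbitrary safety limit.
    """
    for _ in range(max_rounds):
        decoded = unquote(text)
        if decoded == text:
            # Nothing left to decode.
            break
        text = decoded
    return text


result = unquote_fully("what%2527s%2bthe%2btime%252c%2bnow%253f")
print(result)
```

Decoding until stable is only safe when the original text cannot itself contain something that looks like a %XX escape; otherwise it will be decoded one time too many. If the + signs are meant to be spaces, urllib.parse.unquote_plus can be used for the last pass instead.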

Answer 1: (score: 0)

Something like this?

title = u"what%2527s%2bthe%2btime%252c%2bnow%253f"
# Note: this only drops non-ASCII characters; the %XX escapes are left untouched.
print title.encode('ascii','ignore')

Also, take a look at this

Answer 2: (score: 0)

You can convert the %(hex) escape sequences with something like this:

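import re

def my_decode(s):
    # Replace each %-escape with the character for its hex value.
    # {2,4} greedily takes up to four hex digits, so %2527 becomes u'\u2527'.
    return re.sub('%([0-9a-fA-F]{2,4})', lambda x: unichr(int(x.group(1), 16)), s)

s = u'what%2527s%2bthe%2btime%252c%2bnow%253f'
print repr(my_decode(s))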

which yields the unicode string

u'what\u2527s+the+time\u252c+now\u253f'

I'm not sure how to convert \u2527 into a single quote, or how to drop the \u253f and \u252c characters when converting to ASCII.
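
One possible way around this, offered as a sketch rather than a definitive fix: the \u2527 characters appear because the pattern accepts up to four hex digits, while a percent escape is always exactly two. Restricting the pattern to two digits and applying it twice (the input is double-encoded) gives the plain ASCII text. The version below is written for Python 3, so it uses chr instead of unichr; the names HEX_ESCAPE and decode_once are made up for this example, and it only handles escapes for ASCII characters correctly.

```python
import re

# A percent escape is '%' followed by exactly two hex digits.
HEX_ESCAPE = re.compile(r"%([0-9a-fA-F]{2})")


def decode_once(s):
    """Replace each %XX escape with the character whose code is XX."""
    return HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), s)


s = "what%2527s%2bthe%2btime%252c%2bnow%253f"
once = decode_once(s)      # %25 -> '%', %2b -> '+'
twice = decode_once(once)  # %27 -> "'", %2c -> ',', %3f -> '?'
print(once)
print(twice)
```

The first pass turns %25 into a literal %, which leaves single-encoded escapes such as %27 behind; the second pass decodes those.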