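I have the following string that I need to URL-encode: This is currently the top headline on Reddit TIL Pimps wear lots of gold jewelry bought at pawn shops to “re-pawn” for bail money since cash is confiscated upon arrest but jewelry is not
I'm running into a problem because this string contains Unicode characters, specifically the curly quotation marks.
I have tried urllib.quote_plus(message), but that raises the following exception: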
Traceback (most recent call last):
  File "testProgram.py", line 44, in <module>
    main() # Run
  File "testProgram.py", line 41, in main
    testProgram(headline) # Make phone call
  File "testProgram.py", line 31, in testProgram
    urllib.quote_plus(message)
  File "/usr/local/Cellar/python/2.7.8_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 1293, in quote_plus
    s = quote(s, safe + ' ')
  File "/usr/local/Cellar/python/2.7.8_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 1288, in quote
    return ''.join(map(quoter, s))
KeyError: u'\u201c'
Does anyone know why this happens?
Answer 0 (score: 4)
If message is a Unicode string, try:
urllib.quote_plus(message.encode('utf-8'))
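Alas, utf-8 is not universally used for URLs (I don't think there is a universally accepted standard), but it is very common because of its "universal" nature: every Unicode character can be represented in utf-8, which is not true of many other popular encodings.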
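As a rough illustration of the suggestion above, here is a minimal sketch that encodes the text to UTF-8 bytes before quoting it. The helper name encode_for_url and the shortened headline are made up for this example, and the try/except import is only there so the same snippet runs on both Python 2 and Python 3 (where quote_plus lives in urllib.parse).

```python
# -*- coding: utf-8 -*-
try:
    from urllib.parse import quote_plus  # Python 3 location
except ImportError:
    from urllib import quote_plus  # Python 2 location, as in the question


def encode_for_url(message):
    """Hypothetical helper: percent-encode message for use in a URL.

    Text (unicode on Python 2, str on Python 3) is first encoded to
    UTF-8 bytes, so quote_plus only ever sees a byte string.
    """
    if not isinstance(message, bytes):
        message = message.encode('utf-8')
    return quote_plus(message)


# Shortened version of the headline, with the curly quotes written as escapes.
headline = u'bought at pawn shops to \u201cre-pawn\u201d for bail money'
print(encode_for_url(headline))
# bought+at+pawn+shops+to+%E2%80%9Cre-pawn%E2%80%9D+for+bail+money
```

Whoever decodes the URL has to assume the same encoding (UTF-8 here) to get the original characters back.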