Question

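I have the following string that I need to URL-encode: This is currently the top headline on Reddit TIL Pimps wear lots of gold jewelry bought at pawn shops to “re-pawn” for bail money since cash is confiscated upon arrest but jewelry is not

I'm running into a problem because this string contains Unicode characters, specifically the curly quotation marks.

I have tried urllib.quote_plus(message), but this raises the following exception: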

Traceback (most recent call last):
  File "testProgram.py", line 44, in <module>
    main()                                      # Run
  File "testProgram.py", line 41, in main
    testProgram(headline)                                   # Make phone call
  File "testProgram.py", line 31, in testProgram
    urllib.quote_plus(message)
  File "/usr/local/Cellar/python/2.7.8_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 1293, in quote_plus
    s = quote(s, safe + ' ')
  File "/usr/local/Cellar/python/2.7.8_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 1288, in quote
    return ''.join(map(quoter, s))
KeyError: u'\u201c'

Does anyone know why this happens?

Answer 1

If message is a Unicode string, try:

urllib.quote_plus(message.encode('utf-8'))

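Alas, utf-8 is not universally used in URLs (I don't think there is a universally accepted standard, alas), but it is very popular thanks to its "universal" nature: every Unicode character can be represented in utf-8, which is not true of many other popular encodings.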
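To make the suggestion concrete, here is a minimal sketch of encoding to utf-8 before quoting. The helper name url_encode and the shortened headline are made up for illustration and are not part of the question's testProgram.py. The try/except import is only there so that the same snippet should run on both Python 2 and Python 3, where quote_plus lives in urllib.parse.

```python
# -*- coding: utf-8 -*-
try:
    from urllib import quote_plus          # Python 2 location
except ImportError:
    from urllib.parse import quote_plus    # Python 3 location


def url_encode(message):
    """Hypothetical helper: percent-encode text for use in a URL.

    Assumes the receiving server expects utf-8, which is common
    but not guaranteed.
    """
    # Turn text into utf-8 bytes first; byte strings are passed through as-is.
    if not isinstance(message, bytes):
        message = message.encode('utf-8')
    return quote_plus(message)


# Shortened stand-in for the headline, including the curly quotes
# (U+201C and U+201D) that triggered the KeyError.
headline = u'to \u201cre-pawn\u201d for bail money'
print(url_encode(headline))
```

Each curly quote becomes three percent-escaped bytes (%E2%80%9C and %E2%80%9D) and spaces become plus signs, so the result is plain ASCII.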
