缩短版本: 如何编写一个函数,它将包含一个包含字符串表示形式的字符串(例如“This is a character \ u200a”)并用它们代表的字符替换它们(例如“This is a character”)
更长的版本 我需要编写一个函数,它将采用以下字符串:
<p>" \'These things matter\' a lot—in Virginia, Florida, North Carolina, and Ohio—just to name a few states."</p>
并将其转换为
<p>" 'These things matter' a lot—in Virginia, Florida, North Carolina, and Ohio—just to name a few states."</p>
到目前为止,我提出的最好的方法是将其编码为cp1252,然后将其解码为utf-8
>>> x.encode("cp1252").decode("utf-8")
<p>"\u200a\'These things matter\' a lot—in Virginia, Florida, North Carolina, and Ohio—just to name a few states."</p>
是否有一个我可以编写的函数或一个库,它可以让我在使用\'字符并找到那个糟糕的“毛球空间”编码字符方面的最后一步?
谢谢!
答案 0 :(得分:2)
您的文字是正确的,但请使用print()
查看。
>>> x = " \'These things matter\' a lot—in Virginia, Florida, North Carolina, and Ohio—just to name a few states."
>>> print(x.encode("cp1252").decode("utf-8"))
'These things matter' a lot—in Virginia, Florida, North Carolina, and Ohio—just to name a few states.
Python Shell
通常使用print(repr(...))
自动显示每行代码的结果。它提供了比普通print(...)
>>> print(repr(x.encode("cp1252").decode("utf-8")))
"\u200a\'These things matter\' a lot—in Virginia, Florida, North Carolina, and Ohio—just to name a few states."