Question

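I have a lot of txt files, and I need to replace some text in them. Almost all of them contain this non-ASCII character (I thought it was "...", but "…" is not the same thing). I tried replace(), but I couldn't get it to work. I need some help! Thanks in advance.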

Answer 1

If you open the files with codecs.open(), you get every string as unicode, which is much easier to handle.
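As a rough sketch of that suggestion, the snippet below reads a file through codecs.open(), replaces the character on the decoded text, and writes the result back. It assumes the files are UTF-8 and that the troublesome character is U+2026 (horizontal ellipsis); neither is confirmed in the question. The helper name replace_in_file and the temporary demo file are made up for illustration.

```python
import codecs
import os
import tempfile

ELLIPSIS = u"\u2026"  # assumed target: the single-character ellipsis


def replace_in_file(path, old, new, encoding="utf-8"):
    """Rewrite the file at `path`, replacing every `old` with `new`.

    Hypothetical helper; the encoding default is an assumption.
    """
    # Reading through codecs.open() yields decoded (unicode) text.
    with codecs.open(path, "r", encoding=encoding) as f:
        text = f.read()
    # Writing through codecs.open() encodes the text again on the way out.
    with codecs.open(path, "w", encoding=encoding) as f:
        f.write(text.replace(old, new))


# Demo on a throwaway file so the sketch is self-contained.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with codecs.open(demo_path, "w", encoding="utf-8") as f:
    f.write(u"wait" + ELLIPSIS + u" done")

replace_in_file(demo_path, ELLIPSIS, u"...")

with codecs.open(demo_path, "r", encoding="utf-8") as f:
    result = f.read()
os.remove(demo_path)
```

On Python 3 the built-in open(path, encoding="utf-8") gives the same decoded text, so codecs.open() is mainly useful on Python 2.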

Answer 2

Use unicode strings. For example:

>>> print(u'\xe2'.replace(u'\xe2', u'a'))
a
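To show why the string type matters, here is a small sketch that starts from raw bytes, decodes them, and only then calls replace(). It assumes the data is UTF-8 and that the character in question is U+2026; the sample text is invented.

```python
# UTF-8 bytes for u"loading\u2026": the ellipsis takes three bytes (E2 80 A6).
raw = b"loading\xe2\x80\xa6"

# Decode first, so the ellipsis becomes a single character.
text = raw.decode("utf-8")

# Both arguments are unicode strings, so the match is character for character.
fixed = text.replace(u"\u2026", u"...")

# Encode again only when the bytes are needed, e.g. for writing to disk.
out = fixed.encode("utf-8")
```

Calling replace() on the undecoded bytes with a one-byte pattern such as '\xe2' would only touch the first of the three bytes and leave the other two behind.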

Answer 3

The problem is that these characters are not valid in a str; they are unicode.

import re
# Pass flags by keyword: the fourth positional argument of re.sub() is count, not flags.
re.sub(r'<string to replace>', '', text, flags=re.U)
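A concrete version of the regular-expression approach might look like the sketch below. The sample text and the choice of U+2026 as the target are assumptions; the second pattern is an extra illustration that strips every character outside the ASCII range.

```python
import re

# Invented sample text containing two ellipsis characters and an accented letter.
text = u"first\u2026 second\u2026 caf\u00e9"

# Replace one specific character; flags go in by keyword, not positionally.
no_ellipsis = re.sub(u"\u2026", u"...", text, flags=re.UNICODE)

# Drop every character outside the ASCII range 0x00-0x7F.
ascii_only = re.sub(r"[^\x00-\x7F]", u"", text)
```

For a single fixed character, plain replace() on a unicode string does the same job; re.sub() pays off when the target is a class of characters, as in the second call.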

Most of the other answers will work too.