Question

I have a list of expressions, and I want to replace every occurrence of each expression in a file.

I tried this code:

for a in ex:
    if a in file.split():
        file = file.replace(a, '[' + ' ' + a + ' ' + ']')
print(file)

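My code also replaces expressions that sit inside the brackets of another expression. What I want is to replace only the expressions that are not part of another, already bracketed expression. How can I get the desired result?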

Answer 1

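You can do this with the re module. The order of the patterns is very important here. Within an alternation the regex engine tries the alternatives from left to right, and because 'organizations of human rights' comes before 'human rights', the engine tries to match the string 'organizations of human rights' first. If it finds a match, it replaces the match with '[' + match + ']'. It then moves on to the next pattern, 'human rights', whether or not the previous pattern matched. The 'human rights' pattern matches every 'human rights' string that is not inside an already matched 'organizations of human rights' string, because by default regex matches do not overlap. If you want a regex pattern to produce overlapping matches, you need to put the pattern inside a lookaround, and the pattern must be surrounded by () (that is, a capturing group).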

>>> ex = ['liberty of freedom', 'liberty', 'organizations of human rights', 'human rights']
>>> file = " The american people enjoys a liberty of freedom and there are many international organizations of human rights."
>>> reg = '|'.join(ex)
>>> import re
>>> re.sub('('+reg+')', r'[\1]', file)
' The american people enjoys a [liberty of freedom] and there are many international [organizations of human rights].'
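The example above relies on the list already having longer expressions before the shorter ones they contain. Here is a minimal sketch of a variant that does not depend on the input order, assuming every expression starts and ends with a word character. The helper names bracket_expressions and find_overlapping are hypothetical and not part of the original answer. The second helper illustrates the lookahead-with-capturing-group idea for overlapping matches.

```python
import re


def build_alternation(expressions):
    """Join the expressions into one alternation, longest first."""
    # Longest first, so 'liberty of freedom' is tried before 'liberty'
    # no matter how the input list is ordered.
    ordered = sorted(expressions, key=len, reverse=True)
    # re.escape protects expressions that contain regex metacharacters.
    return '|'.join(re.escape(e) for e in ordered)


def bracket_expressions(text, expressions):
    """Wrap each non-overlapping match in '[ ' and ' ]' (hypothetical helper)."""
    # \b keeps 'liberty' from matching inside a longer word; this assumes
    # each expression begins and ends with a word character.
    pattern = re.compile(r'\b(' + build_alternation(expressions) + r')\b')
    return pattern.sub(r'[ \1 ]', text)


def find_overlapping(text, expressions):
    """List matches that may overlap, using a zero-width lookahead."""
    # The lookahead consumes no text, so a match can start inside a
    # previous one; the capturing group is what findall returns.
    pattern = r'(?=\b(' + build_alternation(expressions) + r')\b)'
    return re.findall(pattern, text)


if __name__ == '__main__':
    ex = ['liberty', 'liberty of freedom', 'human rights',
          'organizations of human rights']
    text = ("The american people enjoys a liberty of freedom and there are "
            "many international organizations of human rights.")
    print(bracket_expressions(text, ex))
    print(find_overlapping(text, ex))
```

Note that the lookahead version reports at most one match per starting position, so 'liberty' is not listed separately from 'liberty of freedom', while 'human rights' is, because it starts at a different position.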
