Question

I have a dictionary like this:

corrections = {'L.C.M.':'LCM','L.C.M':'LCM'}
sometext = 'L.C.M is cool but L.C.M. is not L.C.Mwhichshouldnotchange'
expected = 'LCM is cool but LCM is not L.C.Mwhichshouldnotchange'

I need to replace the first and second occurrences with LCM, so I wrote this code:

import re
for abbr, replace in corrections.items():
    # \b marks a word boundary on each side of the escaped abbreviation
    pattern = r'\b' + re.escape(abbr) + r'\b'
    sometext = re.sub(pattern, replace, sometext)

This code partly works, but:

L.C.M. -> (Apply 1st replacement) -> LCM. (which is wrong)
L.C.M. -> (Apply 2nd replacement) -> LCM  (right)

I need a simple way to do the replacement, because I have a large list of abbreviations to replace.

Answer 1

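What matters is the order of corrections.items(). In this case, corrections.items() yields L.C.M and then L.C.M., which is the unfortunate order. My solution is to go through the items of corrections in reverse lexicographic order, which yields L.C.M. and then L.C.M. To do so, replace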

for abbr,replace in corrections.items():

with

rev_dict_items = sorted(corrections.items(), reverse=True)
for abbr,replace in rev_dict_items:
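The ordering idea can also be sketched as a single-pass substitution. This is only an illustration under stated assumptions: it sorts keys longest first (a swapped-in ordering, not the reverse lexicographic sort above), and it replaces `\b` with the lookarounds `(?<!\w)` and `(?!\w)`, because a trailing `\b` never matches after a final dot followed by a space, so ordering alone may not be enough. The names `keys`, `pattern`, and `apply_corrections` are hypothetical.

```python
import re

corrections = {'L.C.M.': 'LCM', 'L.C.M': 'LCM'}

# Assumption: longest keys first, so 'L.C.M.' is tried before its prefix 'L.C.M'.
keys = sorted(corrections, key=len, reverse=True)

# One alternation of all escaped abbreviations.
# (?<!\w) and (?!\w) require that no word character touches the match.
pattern = re.compile(
    r'(?<!\w)(?:' + '|'.join(re.escape(k) for k in keys) + r')(?!\w)'
)

def apply_corrections(text):
    """Replace every abbreviation in one pass over the text."""
    return pattern.sub(lambda m: corrections[m.group(0)], text)

sometext = 'L.C.M is cool but L.C.M. is not L.C.Mwhichshouldnotchange'
result = apply_corrections(sometext)
```

Compiling one pattern also avoids rescanning the text once per abbreviation, which may help with a large list.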

Answer 2

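The problem is in matching L.C.M., because it ends with a dot and you are also using \b to enforce a word boundary. A word boundary will not match after the last dot (.) unless the dot is followed by an alphanumeric character, so in "L.C.M. is" the pattern for L.C.M. never matches.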
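A small check of this word-boundary behavior, written as a sketch with made-up sample strings:

```python
import re

dotted = r'\bL\.C\.M\.\b'

# The final dot is followed by a space: no word boundary there, so no match.
followed_by_space = re.search(dotted, 'L.C.M. is not')

# The final dot is followed by a letter: \b matches between '.' and 'x'.
followed_by_letter = re.search(dotted, 'L.C.M.x')

# The shorter key does match inside 'L.C.M.', leaving the stray dot behind.
partial = re.sub(r'\bL\.C\.M\b', 'LCM', 'L.C.M. is not')
```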

If you want to make sure the string you are matching is not followed by a non-whitespace character, you can use:

\b(L\.C\.M\.)(?!\S)

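Here (?!\S) ensures that the matched string is not followed by a non-whitespace character. It therefore matches when the string is followed by whitespace or sits at the end of the input.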
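Applied to the whole dictionary from the question, the lookahead might be used as in the sketch below. It assumes every abbreviation should be delimited by whitespace or the end of the string, so an abbreviation followed directly by a comma would be left untouched.

```python
import re

corrections = {'L.C.M.': 'LCM', 'L.C.M': 'LCM'}
sometext = 'L.C.M is cool but L.C.M. is not L.C.Mwhichshouldnotchange'

for abbr, replace in corrections.items():
    # Keep \b at the start; use (?!\S) instead of \b at the end.
    pattern = r'\b' + re.escape(abbr) + r'(?!\S)'
    sometext = re.sub(pattern, replace, sometext)
```

With this pattern the iteration order no longer matters: the shorter key L.C.M cannot match inside L.C.M., because the dot that follows it is a non-whitespace character.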


Combining whole-word replacement with regex characters in Python
