Question

How do I replace a pattern when the replacement itself varies from match to match?

I have the following string:

s = '''[[merit|merited]] and [[eat|eaten]] and [[go]]'''

I would like to keep only the rightmost word in each pair of brackets ('merited', 'eaten', 'go'), stripping away everything that surrounds these words, thus producing:

merited and eaten and go

I have the regex:

p = '''\[\[[a-zA-Z]*\[|]*([a-zA-Z]*)\]\]'''

...which yields:

>>> re.findall(p, s)
['merited', 'eaten', 'go']

However, because the replacement text is different for every match, I don't see a way to use re.sub() or s.replace().

Answer 1

import re

s = '''[[merit|merited]] and [[eat|eaten]] and [[go]]'''
p = r'''\[\[[a-zA-Z]*?[|]*([a-zA-Z]*)\]\]'''
re.sub(p, r'\1', s)  # 'merited and eaten and go'

The ? makes the first [a-zA-Z]* lazy, so for [[go]] it matches the empty (shortest) string and the second [a-zA-Z]* gets the actual go string.

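\1 substitutes the first (in this case the only) capture group of the pattern for each non-overlapping match in the string s. r'\1' is used so that \1 is not interpreted as the character with code 0x1.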
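To make both points concrete, here is a small sketch comparing a greedy first word with the lazy one used in this answer, and checking what '\1' means with and without the r prefix. The names greedy, lazy, greedy_result and lazy_result are only for illustration.

```python
import re

s = '[[merit|merited]] and [[eat|eaten]] and [[go]]'

# Greedy first word: for [[go]] it swallows "go", leaving the capture group empty.
greedy = r'\[\[[a-zA-Z]*[|]*([a-zA-Z]*)\]\]'
# Lazy first word (the pattern from this answer): the capture group gets "go".
lazy = r'\[\[[a-zA-Z]*?[|]*([a-zA-Z]*)\]\]'

greedy_result = re.sub(greedy, r'\1', s)
lazy_result = re.sub(lazy, r'\1', s)
print(repr(greedy_result))  # 'merited and eaten and '
print(repr(lazy_result))    # 'merited and eaten and go'

# Without the r prefix, '\1' is the single character chr(1), not a group reference.
print('\1' == chr(1))    # True
print(r'\1' == '\\1')    # True
```

With the greedy version the last link disappears entirely, which is why the ? matters here.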

Answer 2

First, you need to fix your regex so that it captures the whole group:

>>> import re
>>> s = '[[merit|merited]] and [[eat|eaten]] and [[go]]'
>>> p = r'(\[\[(?:[a-zA-Z]*\|)*([a-zA-Z]*)\]\])'
>>> re.findall(p, s)
[('[[merit|merited]]', 'merited'), ('[[eat|eaten]]', 'eaten'), ('[[go]]', 'go')]

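This matches the whole [[whateverisinhere]], putting the entire match in group 1 and just the final word in group 2. You can then use the \2 token to replace the whole match with group 2 alone: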

>>> re.sub(p,r'\2',s)
'merited and eaten and go'

Or change your pattern to:

p = r'\[\[(?:[a-zA-Z]*\|)*([a-zA-Z]*)\]\]'

This drops the grouping of the entire match as group 1 and groups only the part you want. Then you can do:

>>> re.sub(p,r'\1',s)

to the same effect.
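If the group numbering is not obvious, a quick check like the following may help. It is only a sketch; with_outer and without_outer are names made up here for the two patterns above.

```python
import re

s = '[[merit|merited]] and [[eat|eaten]] and [[go]]'

# Group 1 = whole link, group 2 = last word.
with_outer = r'(\[\[(?:[a-zA-Z]*\|)*([a-zA-Z]*)\]\])'
# Group 1 = last word; (?:...) is non-capturing and gets no number.
without_outer = r'\[\[(?:[a-zA-Z]*\|)*([a-zA-Z]*)\]\]'

first_with = re.search(with_outer, s).groups()
first_without = re.search(without_outer, s).groups()
print(first_with)     # ('[[merit|merited]]', 'merited')
print(first_without)  # ('merited',)

# Both substitutions produce the same string.
same = re.sub(with_outer, r'\2', s) == re.sub(without_outer, r'\1', s)
print(same)  # True
```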

POST EDIT：

I forgot to mention that I actually changed your regex, so here is the explanation:

\[\[(?:[a-zA-Z]*\|)*([a-zA-Z]*)\]\]
\[\[                           \]\]  # literal matches of brackets
    (?:           )*                 # non-capturing group that can match 0 or more of what's inside
       [a-zA-Z]*\|                   # matches any word that is followed by a '|' character
                    (   ...   )      # captures the final word into group 1
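The annotated pattern above can also be written so that Python accepts the comments directly, using the re.VERBOSE flag. This is just a sketch of the same regex; the name LINK is my own choice.

```python
import re

# Same pattern, spelled out with re.VERBOSE so the comments live inside the regex.
LINK = re.compile(r'''
    \[\[                  # literal opening brackets
    (?: [a-zA-Z]* \| )*   # zero or more "word|" prefixes (non-capturing)
    ( [a-zA-Z]* )         # group 1: the final word
    \]\]                  # literal closing brackets
''', re.VERBOSE)

s = '[[merit|merited]] and [[eat|eaten]] and [[go]]'
result = LINK.sub(r'\1', s)
print(result)  # merited and eaten and go
```

In verbose mode, whitespace outside character classes is ignored, so the layout does not change what the pattern matches.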

I think this is more robust than your original, because it also works when there are more than two alternatives:

>>> s = '[[merit|merited]] and [[ate|eat|eaten]] and [[go]]'
>>> p = r'\[\[(?:[a-zA-Z]*\|)*([a-zA-Z]*)\]\]'
>>> re.sub(p,r'\1',s)
'merited and eaten and go'
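As a side note on the "replacement is a variable" part of the question: re.sub also accepts a function as the replacement, which is called once per match and returns the text to put in its place. This is a different technique from the \1 backreference used above, sketched here with the hypothetical names LINK and last_word.

```python
import re

LINK = re.compile(r'\[\[(?:[a-zA-Z]*\|)*([a-zA-Z]*)\]\]')

def last_word(match):
    # Called once per match; whatever it returns replaces that match.
    return match.group(1)

s = '[[merit|merited]] and [[ate|eat|eaten]] and [[go]]'
result = LINK.sub(last_word, s)
print(result)  # merited and eaten and go

# A function can also transform the captured text, which a plain \1 cannot do.
upper = LINK.sub(lambda m: m.group(1).upper(), s)
print(upper)  # MERITED and EATEN and GO
```

For a plain "keep group 1" substitution the backreference is shorter; the function form is mainly useful when the replacement needs some computation.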

Stripping variable borders with python re
