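Whenever I want to replace a piece of text that is part of a larger piece of text, I always have to do something like this:
"(?P<start>some_pattern)(?P<replace>foo)(?P<end>end)"
and then concatenate the start group, the new data for replace, and the end group.
Is there a better way to do this?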
Answer 0 (score: 106)
>>> import re
>>> s = "start foo end"
>>> s = re.sub("foo", "replaced", s)
>>> s
'start replaced end'
>>> s = re.sub("(?<= )(.+)(?= )", lambda m: "can use a callable for the %s text too" % m.group(1), s)
>>> s
'start can use a callable for the replaced text too end'
>>> help(re.sub)
Help on function sub in module re:
sub(pattern, repl, string, count=0)
Return the string obtained by replacing the leftmost
non-overlapping occurrences of the pattern in string by the
replacement repl. repl can be either a string or a callable;
if a callable, it's passed the match object and must return
a replacement string to be used.
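Answer 1 (score: 18)
Look in the Python re documentation for lookaheads (?=...) and lookbehinds (?<=...) - I'm pretty sure they're what you want. They match strings, but do not "consume" the bits of the strings they match.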
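As a minimal sketch of what "not consuming" means here (the sample string and variable names are made up for illustration), the lookbehind and lookahead below check the surrounding context, but only "foo" itself is part of the match and gets replaced:

```python
import re

text = "start foo end"  # hypothetical sample input

# Match "foo" only when it is preceded by "start " and followed by " end".
# The lookbehind and lookahead assert context without becoming part of the match.
pattern = re.compile(r"(?<=start )foo(?= end)")

match = pattern.search(text)
matched_text = match.group(0)      # just the middle part, not the context
result = pattern.sub("bar", text)  # the context is left untouched

print(matched_text)
print(result)
```

Because the context is never part of the match, there is nothing to concatenate back afterwards.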
Answer 2 (score: 11)
The short version is that you cannot use variable-width patterns in lookbehinds with Python's re module. There is no way to change this:
>>> import re
>>> re.sub("(?<=foo)bar(?=baz)", "quux", "foobarbaz")
'fooquuxbaz'
>>> re.sub("(?<=fo+)bar(?=baz)", "quux", "foobarbaz")
Traceback (most recent call last):
File "<pyshell#2>", line 1, in <module>
re.sub("(?<=fo+)bar(?=baz)", "quux", string)
File "C:\Development\Python25\lib\re.py", line 150, in sub
return _compile(pattern, 0).sub(repl, string, count)
File "C:\Development\Python25\lib\re.py", line 241, in _compile
raise error, v # invalid expression
error: look-behind requires fixed-width pattern
This means that you'll need to work around it, and the simplest solution is very similar to what you're doing now:
>>> re.sub("(fo+)bar(?=baz)", "\\1quux", "foobarbaz")
'fooquuxbaz'
>>>
>>> # If you need to turn this into a callable function:
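>>> def replace(start, replace, end, replacement, search):
...     # start, replace and end are treated as literal text, hence re.escape.
...     # A callable is used for repl so that backslashes in replacement stay literal.
...     return re.sub("(" + re.escape(start) + ")" + re.escape(replace) + "(?=" + re.escape(end) + ")", lambda m: m.group(1) + replacement, search)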
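This doesn't have the elegance of the lookbehind solution, but it is still a very clear, straightforward one-liner. And if you look at what an expert has to say on the matter (he's talking about JavaScript, which lacks lookbehinds entirely, but many of the principles are the same), you'll see that his simplest solution looks a lot like this one.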
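The same workaround can be sketched with a named group, which avoids counting parentheses. This is only an illustration: the input string and the group name `start` are assumptions, and `\g<start>` is the re module's syntax for referring to a named group in the replacement.

```python
import re

text = "foooobarbaz"  # hypothetical input with a variable-length prefix

# "fo+" cannot go inside a lookbehind, so capture it and put it back
# in the replacement with \g<start>. The suffix can stay a lookahead,
# because lookaheads are allowed to be variable-width.
pattern = re.compile(r"(?P<start>fo+)bar(?=ba+z)")

result = pattern.sub(r"\g<start>quux", text)
unchanged = pattern.sub(r"\g<start>quux", "xbarbaz")  # no "fo+" prefix, so no match

print(result)
print(unchanged)
```

If the replacement text comes from user input, passing a callable as repl is probably safer, since backslashes in a replacement string are interpreted as escapes.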
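Answer 3 (score: 4)
I think the best idea is simply to capture whatever you want to replace in a group, and then replace it using the start and end properties of the captured group.
Regards
Adrian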
# The pattern contains the expression we want to replace as its first group
import re

pat = r"word1\s(.*)\sword2"
test = "word1 will never be a word2"
repl = "replace"

m = re.search(pat, test)
# Check that the pattern matched and that group 1 actually captured something
if m and m.group(1) is not None:
    # Keep everything before and after group 1, and swap in the replacement
    line = test[:m.start(1)] + repl + test[m.end(1):]
    print(line)
else:
    print("the pattern didn't capture any text")
This will print: 'word1 replace word2'
The group to be replaced can be located anywhere in the string.
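If this comes up often, the slicing can be wrapped in a small function. The sketch below is a hypothetical helper (`replace_group` is not part of the re module), and it only handles the first match:

```python
import re

def replace_group(pattern, repl, string, group=1):
    """Replace only the text captured by `group` in the first match.

    Hypothetical helper for illustration; `group` may be a number or a name.
    """
    m = re.search(pattern, string)
    if m is None or m.group(group) is None:
        return string  # no match, or the group did not take part in it
    # Keep the text on both sides of the group and swap in the replacement
    return string[:m.start(group)] + repl + string[m.end(group):]

print(replace_group(r"word1\s(.*)\sword2", "replace", "word1 will never be a word2"))
```

Since `start()` and `end()` accept group names as well as numbers, the same call should work with named groups, for example `group="replace"` for the pattern in the question.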