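I want to replace the text matched by regex groups with the '#' character.
There will be a variable number of regexes, each containing a variable number of groups.
Only the values of the regex groups should be replaced.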
#! /usr/bin/python
import re
data = """Line1 '4658'
Line2 data 'AAA'\tBBB\t55555
Roach""".splitlines()
# a variable number of Regex's containing a variable number of groups
needles = [ r"Line1 '(\d+)'",
r"'(AAA)'\t\S+\t(\S+)",
r"(Roach)" ]
pattern = re.compile('|'.join(needles))
for line in data:
    match = pattern.search(line)
    if (match):
        print(re.sub(match.string[match.start():match.end()], '#' * len(match.string), line))
# current output
"""
############
Line2 data ##########################
#####
"""
# desired output
"""
Line1 '####'
Line2 data '###' BBB #####
#####
"""
Answer 0 (score: 0)
Modify the code as follows:
#! /usr/bin/python
import re
data = """Line1 '4658'
Line2 data 'AAA'\tBBB\t55555
Roach""".splitlines()
# a variable number of regexes containing a variable number of groups
needles = [r"Line1 '(\d+)'",
           r"'(AAA)'\t\S+\t(\S+)",
           r"(Roach)"
           ]
pattern = re.compile('|'.join(needles))
for line in data:
    match = pattern.search(line)
    if match:
        for matched_str in match.groups():
            # groups belonging to alternatives that did not match are None
            if matched_str:
                # escape the text so it is replaced literally, not treated as a regex
                line = re.sub(re.escape(matched_str), '#' * len(matched_str), line)
    print(line)
When run:
$ python a.py
Line1 '####'
Line2 data '###' BBB #####
#####
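One caveat with replacing by text: `re.sub` replaces every occurrence of the group's text in the line, not just the part the group matched. A possible alternative is to replace by position, using the spans the match object reports. The sketch below assumes the question's needles; `mask_groups` and its `fill` parameter are hypothetical names introduced for illustration, and only the first match per line is handled.

```python
import re

def mask_groups(pattern, line, fill="#"):
    """Mask the text of each group in the first match, by position."""
    match = pattern.search(line)
    if match is None:
        return line
    chars = list(line)
    for i in range(1, pattern.groups + 1):
        start, end = match.span(i)
        # span() returns (-1, -1) for a group that did not take part in the match
        if start != -1:
            chars[start:end] = fill * (end - start)
    return "".join(chars)

needles = [r"Line1 '(\d+)'",
           r"'(AAA)'\t\S+\t(\S+)",
           r"(Roach)"]
pattern = re.compile("|".join(needles))

for line in ["Line1 '4658'", "Line2 data 'AAA'\tBBB\t55555", "Roach"]:
    print(mask_groups(pattern, line))
```

Because only the reported spans are touched, a line such as "AAA 'AAA'\tBBB\tAAA" keeps its leading AAA, which lies outside any group.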
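Answer 1 (score: 0)
You don't need an extra match with re.search().
You only need to change the regexes so that they match every part of the string you want to keep, then use a suitable function to replace the target parts.
Here is an example for one of the lines: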
In [51]: def replacer(x):
   ....:     matched = x.groups()
   ....:     if len(matched) == 4:
   ....:         return "{}{}{}{}".format(matched[0], len(matched[1]) * '*', matched[2], len(matched[3]) * '*')
   ....:
In [52]: pattern = re.compile(r"([^']*)'(AAA)'(\t\S+\t)(\S+)")
In [53]: pattern.sub(replacer, "Line2 data 'AAA'\tBBB\t55555")
Out[53]: 'Line2 data ***\tBBB\t*****'
Here is the complete code:
import re
data = """Line1 '4658'
Line2 data 'AAA'\tBBB\t55555
Roach""".splitlines()
# a variable number of Regex's containing a variable number of groups
needles = [ r"(Line1 )'(\d+)'",
r"([^']*)'(AAA)'(\t\S+\t)(\S+)",
r"(Roach)" ]
def replacer(x):
    matched = x.groups()
    if matched[2] is not None:
        # second needle matched: the groups at indices 2-5 are set
        return "{}{}{}{}".format(matched[2], len(matched[3]) * '#', matched[4], len(matched[5]) * '#')
    elif matched[0] is not None:
        # first needle matched: the groups at indices 0-1 are set
        return "{}{}".format(matched[0], len(matched[1]) * '#')
    elif matched[-1] is not None:
        # third needle matched: only the last group is set
        return len(matched[-1]) * '#'
pattern = re.compile('|'.join(needles))
for line in data:
    print(pattern.sub(replacer, line))
Output (the quotes are part of the match but sit outside the capture groups, so they are dropped):
Line1 ####
Line2 data ### BBB #####
#####
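The replacer above hard-codes group indices, so it has to change whenever a needle is added or reordered. A more generic callback might rebuild the matched text from the group spans instead. This is only a sketch: `mask_match` and its `fill` parameter are hypothetical names, and it is shown with the question's original needles, which do not need extra groups for the text to keep.

```python
import re

def mask_match(match, fill="#"):
    """Callback for pattern.sub: return the match with its groups masked."""
    base = match.start()
    chars = list(match.group(0))
    for i in range(1, match.re.groups + 1):
        start, end = match.span(i)
        # (-1, -1) marks a group from an alternative that did not match
        if start != -1:
            # spans are relative to the whole string, so shift by the match start
            chars[start - base:end - base] = fill * (end - start)
    return "".join(chars)

needles = [r"Line1 '(\d+)'",
           r"'(AAA)'\t\S+\t(\S+)",
           r"(Roach)"]
pattern = re.compile("|".join(needles))

data = """Line1 '4658'
Line2 data 'AAA'\tBBB\t55555
Roach""".splitlines()

for line in data:
    print(pattern.sub(mask_match, line))
```

Since `pattern.sub` calls the callback for every match, several matches on one line are all masked, and text outside the groups (including the quotes) is left as it was.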