The text file contains
This is line ABC XYZ. This is something. This is ABC XYZ. foo. This is ABC XYZ. foo
The desired output is
This is line 1 ABC XYZ. This is something. This is 2 ABC XYZ. foo. This is 3 ABC XYZ. foo
So the problem is to replace the n-th occurrence of ABC XYZ with n ABC XYZ.
Answer 0 (score: 1)
You can use a list comprehension:
a = "This is line ABC XYZ. This is something. This is ABC XYZ. foo. This is ABC XYZ. foo"
# Number every piece that precedes a match, then append the unnumbered tail.
''.join([e + str(c + 1) + " ABC XYZ" for c, e in enumerate(a.split("ABC XYZ"))][0:-1]) + a.split("ABC XYZ")[-1]
Answer 1 (score: 1)
The re.sub method can take a function as its second argument. Use a stateful function that holds an itertools.count object as the counter.
import re, itertools

s = 'This is line ABC XYZ. This is something. This is ABC XYZ. foo. This is ABC XYZ. foo'

def enumerator():
    counter = itertools.count(1)
    return lambda m: '{} {}'.format(next(counter), m.group())

out = re.sub(r'ABC XYZ', enumerator(), s)
print(out)
The enumerator function can be reused with any pattern. For the input above, the code prints:
This is line 1 ABC XYZ. This is something. This is 2 ABC XYZ. foo. This is 3 ABC XYZ. foo
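As a small illustration of that reuse, here is a sketch that applies the same enumerator to two other patterns. The sample string and the variable names (sample, numbered, again) are made up for this example; each call to enumerator() builds a fresh counter, so numbering restarts at 1 for every substitution.

```python
import itertools
import re

def enumerator():
    # Each call creates an independent counter starting at 1.
    counter = itertools.count(1)
    return lambda m: '{} {}'.format(next(counter), m.group())

# Hypothetical sample text, not from the question.
sample = 'foo bar foo baz foo'

# Number every occurrence of the literal 'foo'.
numbered = re.sub(r'foo', enumerator(), sample)

# A second call gets its own counter, so 'bar' and 'baz' are numbered from 1 again.
again = re.sub(r'ba[rz]', enumerator(), sample)

print(numbered)
print(again)
```

Sharing one counter across several re.sub calls would instead require creating the replacement function once and passing the same object each time.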
Answer 2 (score: 0)
Code:
import re

text = "This is line ABC XYZ. This is something. This is ABC XYZ. foo. This is ABC XYZ. foo"
# The capturing group makes re.split keep the matched delimiters in the result list.
x = re.split("(ABC XYZ)", text)
c = 0
for i, s in enumerate(x):
    # Only the delimiter pieces match; prefix each with its running count.
    if re.match('(ABC XYZ)', s):
        c += 1
        x[i] = str(c) + ' ' + s
# This is line 1 ABC XYZ. This is something. This is 2 ABC XYZ. foo. This is 3 ABC XYZ. foo
x = ''.join(x)
There are more optimized ways to do this, but this version should help you understand what is going on.
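One possible shorter variant, offered only as a sketch: for a fixed literal such as ABC XYZ, plain str.split is enough and no regular expression is needed. The helper name number_occurrences is hypothetical and not part of the answer above.

```python
def number_occurrences(text, target):
    """Prefix the n-th occurrence of the literal `target` with 'n '."""
    parts = text.split(target)
    # parts[0] precedes the first match; every later piece follows one match.
    out = [parts[0]]
    for n, tail in enumerate(parts[1:], start=1):
        out.append(f"{n} {target}{tail}")
    return "".join(out)

text = "This is line ABC XYZ. This is something. This is ABC XYZ. foo. This is ABC XYZ. foo"
result = number_occurrences(text, "ABC XYZ")
print(result)
```

This makes a single pass over the split pieces and leaves the text unchanged when the target does not occur.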