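Suppose I have a bunch of strings in a list named "main". How can I iterate over "main" and, whenever a match is found, remove the matched part from "main" and append the matched text to a new list named "new"?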
python
import re

main = ['text \fc + \fr this is my match1 \fc* text', 'text \fc + \fr this is my match2 \fc* text', 'text', 'text', 'text \fc + \fr this is my match \fc* text']
new = []

def rematch(pattern, inp):
    matcher = re.compile(pattern)
    matches = matcher.match(inp)
    if matches:
        new.append(matches)
        # remove match from "main" somehow?

for x in main:
    for m in rematch('\\fc \+ \\fr(.*?)\\fc\*', x):
        pass  # loop body was not given in the question
Desired result:
main = ['text text', 'text text', 'text', 'text', 'text text']
new = ['this is my match1', 'this is my match2', 'this is my match']
Answer 0 (score: 2)
In [33]: import re
In [34]: pat = re.compile('\\fc \+ \\fr(.*?)\\fc\*')
In [43]: main, new = zip(*[(''.join(parts[::2]), ''.join(parts[1::2])) for parts in [pat.split(m) for m in main]])
In [44]: new = [n.strip() for n in new if n]
In [45]: main
Out[45]: ('text  text', 'text  text', 'text', 'text', 'text  text')
In [46]: new
Out[46]: ['this is my match1', 'this is my match2', 'this is my match']
Explanation:
Notice what happens when you use pat.split:
In [37]: pat.split(main[0])
Out[37]: ['text ', ' this is my match1 ', ' text']
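This is close to what you want, except that you want the odd-numbered items (1st, 3rd, ...) in main and the even-numbered items (2nd, 4th, ...) in new. We'll get to that in a second.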
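The alternation comes from the capturing group: when the pattern contains one, re.split returns the captured text in between the surrounding pieces. A minimal sketch of that behaviour follows; it assumes the markers are literal backslash sequences and therefore uses raw strings, and the names sample, parts and no_group_parts are made up for illustration.

```python
import re

# Assumption: the markers are literal "\fc" / "\fr", so raw strings are used.
pat = re.compile(r'\\fc \+ \\fr(.*?)\\fc\*')
sample = r'text \fc + \fr this is my match1 \fc* text'

# With a capturing group, the captured text appears between the outer pieces.
parts = pat.split(sample)
print(parts)  # ['text ', ' this is my match1 ', ' text']

# Without the group, only the text around the match is returned.
no_group = re.compile(r'\\fc \+ \\fr.*?\\fc\*')
no_group_parts = no_group.split(sample)
print(no_group_parts)  # ['text ', ' text']

# A string with no match comes back as a one-item list.
print(pat.split('text'))  # ['text']
```

In the session above the strings are not raw, so Python reads each \f as a form feed character; the pattern '\\fc ...' reaches the regex engine as \f, which also means form feed, so the two still line up.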
First, let's apply pat.split to each item in main:
In [51]: [pat.split(m) for m in main]
Out[51]:
[['text ', ' this is my match1 ', ' text'],
 ['text ', ' this is my match2 ', ' text'],
 ['text'],
 ['text'],
 ['text ', ' this is my match ', ' text']]
Next, let's separate the odd-numbered items from the even-numbered items, and use ''.join to combine each group into a single string:
In [52]: [(''.join(parts[::2]), ''.join(parts[1::2])) for parts in [pat.split(m) for m in main]]
Out[52]:
[('text  text', ' this is my match1 '),
 ('text  text', ' this is my match2 '),
 ('text', ''),
 ('text', ''),
 ('text  text', ' this is my match ')]
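To see the slicing on its own, here is a small sketch on a hypothetical split result containing two matches (the list parts and the names outside and inside are invented for illustration). It also shows that ''.join would glue several captures from the same string into one, which may or may not be what you want.

```python
# Hypothetical result of pat.split() on a string containing two matches.
parts = ['a ', ' m1 ', ' b ', ' m2 ', ' c']

outside = parts[::2]   # indices 0, 2, 4: the text around the matches
inside = parts[1::2]   # indices 1, 3: the captured text

print(outside)                 # ['a ', ' b ', ' c']
print(inside)                  # [' m1 ', ' m2 ']
print(repr(''.join(outside)))  # 'a  b  c'
print(repr(''.join(inside)))   # ' m1  m2 '
```

Joining the outer pieces as they are keeps the space on each side of the removed span, so the result has two spaces where the match used to be.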
From here, we can use zip(*...) to separate main from new:
In [53]: main, new = zip(*[(''.join(parts[::2]), ''.join(parts[1::2])) for parts in [pat.split(m) for m in main]])
In [54]: main
Out[54]: ('text  text', 'text  text', 'text', 'text', 'text  text')
In [55]: new
Out[55]: (' this is my match1 ', ' this is my match2 ', '', '', ' this is my match ')
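At this point new still carries the surrounding spaces and the empty strings, which is what the strip-and-filter step in In [44] cleans up. For use outside an interactive session, the same idea could be wrapped in a function. The following is only a sketch under a few assumptions: the markers are literal backslash sequences (hence raw strings), the pattern has exactly one capturing group, and each capture should become its own entry in new. The name extract_matches and the whitespace collapsing are additions for illustration, not part of the answer above.

```python
import re

# Assumption: literal "\fc" / "\fr" markers, written with raw strings.
PAT = re.compile(r'\\fc \+ \\fr(.*?)\\fc\*')

def extract_matches(items, pattern=PAT):
    """Return (cleaned_items, captured_texts).

    Assumes `pattern` has exactly one capturing group, so that
    pattern.split() alternates outside text and captured text.
    """
    cleaned, captured = [], []
    for item in items:
        parts = pattern.split(item)
        # Even indices hold the text outside the matches; collapse the
        # whitespace runs left behind where a match was removed.
        cleaned.append(' '.join(''.join(parts[::2]).split()))
        # Odd indices hold the captured text; strip it and skip empties.
        captured.extend(p.strip() for p in parts[1::2] if p.strip())
    return cleaned, captured

main = [
    r'text \fc + \fr this is my match1 \fc* text',
    r'text \fc + \fr this is my match2 \fc* text',
    'text',
    'text',
    r'text \fc + \fr this is my match \fc* text',
]
main, new = extract_matches(main)
print(main)  # ['text text', 'text text', 'text', 'text', 'text text']
print(new)   # ['this is my match1', 'this is my match2', 'this is my match']
```

Unlike main, new = zip(*[...]), which raises ValueError when the input list is empty, the loop simply returns two empty lists in that case, and it returns lists rather than tuples.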