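Using Python, how can I process two text files? For example, a.txt has 5 groups and b.txt has 4 groups. Each group in b.txt should be looked up among the groups in a.txt. If it is found, it is written to output.txt; if it is not found, it is not written. The numbers in a group must match, but their order does not matter.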
a.txt
GROUP :[11111, 22222, 33333]
GROUP :[22222, 11111]
GROUP :[46098]
GROUP :[66666, 55555, 44444]
GROUP :[55555, 44444]
b.txt
GROUP :[11111, 33333]
GROUP :[46098]
GROUP :[22222, 11111]
GROUP :[44444, 55555, 66666]
output.txt
GROUP :[22222, 11111]
GROUP :[46098]
GROUP :[44444, 55555, 66666]
Answer 0 (score: 1)
This isn't the prettiest thing in the world, but it should get the job done:
from collections import Counter

# Parse each line of a.txt into a list of number strings.
with open('a.txt', 'r') as a:
    a_list = []
    for line in a:
        groups = line.split(':')[1]
        groups = groups.split('[')[1].split(']')[0]
        groups = groups.split(', ')
        a_list.append(groups)

# Parse b.txt the same way.
with open('b.txt', 'r') as b:
    b_list = []
    for line in b:
        groups = line.split(':')[1]
        groups = groups.split('[')[1].split(']')[0]
        groups = groups.split(', ')
        b_list.append(groups)

with open('output.txt', 'w') as output:
    # Counter equality ignores order, so [22222, 11111] matches [11111, 22222].
    a_counter = [Counter(i) for i in a_list]
    for group in b_list:
        if Counter(group) in a_counter:
            # Join the numbers by hand; formatting the list directly would add quotes.
            output.write(f"GROUP :[{', '.join(group)}]\n")
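The matching step above depends on `Counter` equality ignoring the order of the elements. Here is a minimal sketch of that behavior, using a hypothetical helper `same_group` that is not part of the original answer:

```python
from collections import Counter


def same_group(left, right):
    """Return True when both lists hold the same items, in any order.

    Hypothetical helper for illustration; repeated items are significant.
    """
    return Counter(left) == Counter(right)


# Order differs, contents match.
reordered = same_group(["66666", "55555", "44444"], ["44444", "55555", "66666"])
# One list is only a subset of the other, so they are not the same group.
subset = same_group(["11111", "22222", "33333"], ["11111", "33333"])
# A Counter also tracks how often an item occurs, unlike a set.
repeated = same_group(["11111", "11111"], ["11111"])

print(reordered, subset, repeated)
```

Running this prints `True False False`. The question does not say whether a number can repeat within a group; if repeats should be ignored, comparing `set(left) == set(right)` would probably be the simpler choice.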
Answer 1 (score: 1)
Using regular expressions and the re module:
import re

grp_tmpl = list()

# Register all groups from b.txt as sorted lists of number strings
with open('b.txt', 'r') as f:
    for line in f:
        grp_tmpl.append(sorted(re.findall(r'\d+', line)))

# Find groups: write each line of a.txt whose numbers match a registered group
with open('a.txt', 'r') as f, open('output.txt', 'w') as out:
    for line in f:
        if sorted(re.findall(r'\d+', line)) in grp_tmpl:
            out.write(line)
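When both files are large, comparing every line against every registered group gets slow. One possible variant is sketched below. It assumes each line holds exactly one group and that only the digits matter. Each group of a.txt is stored as a sorted tuple in a set, so each lookup is a single hash check. The names `group_key` and `filter_groups` are made up for this example, and the function writes the matching lines of b.txt, as the question describes.

```python
import re


def group_key(line):
    """Order-insensitive key: the line's numbers as a sorted tuple."""
    return tuple(sorted(re.findall(r"\d+", line)))


def filter_groups(a_path, b_path, out_path):
    """Copy the lines of b_path whose group also appears in a_path.

    Hypothetical helper; returns the number of lines written.
    """
    with open(a_path) as a_file:
        # Skip lines without numbers so blank lines never match each other.
        known = {key for key in map(group_key, a_file) if key}

    written = 0
    with open(b_path) as b_file, open(out_path, "w") as out_file:
        for line in b_file:
            if group_key(line) in known:
                out_file.write(line)
                written += 1
    return written
```

Calling `filter_groups('a.txt', 'b.txt', 'output.txt')` on the sample files from the question should write three lines, in the order they appear in b.txt.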