What is the most elegant way to go through a sorted list by its first index? Input:
Meni22 xxxx xxxx
Meni32_2 xxxx xxxx
Meni32_2 xxxx xxxx
Meni45_1 xxxx xxxx
Meni45_1 xxxx xxxx
Meni45 xxxx xxxx
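Should I go line by line:
list1 = []
list2 = []
for line in input:
    if line[0] not in list1:
        list1.append(line)
    else:
        list2.append(line)
The example obviously does not work. It adds the first match of line[0] and moves on. I would rather have it go through the list, add to list1 the lines whose first field occurs only once, and put the rest into list2.
After the script: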
List1:
Meni22 xxxx xxxx
Meni45 xxxx xxxx
List2:
Meni45_1 xxxx xxxx
Meni45_1 xxxx xxxx
Meni32_2 xxxx xxxx
Meni32_2 xxxx xxxx
Answer 0 (score: 3)
Since the file is sorted, you can use groupby
from itertools import groupby
list1, list2 = res = [], []
with open('file1.txt') as fin:
    for k, g in groupby(fin, key=lambda x: x.partition(' ')[0]):
        g = list(g)
        # res is a tuple, so extend the chosen list in place instead of using +=
        res[len(g) > 1].extend(g)
Or, if you prefer this longer version
from itertools import groupby
list1, list2 = [], []
with open('file1.txt') as fin:
    for k, g in groupby(fin, key=lambda x: x.partition(' ')[0]):
        g = list(g)
        if len(g) > 1:
            list2 += g
        else:
            list1 += g
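As a self-contained illustration of the groupby approach, here is a minimal sketch that runs on the question's sample lines held in memory instead of reading file1.txt. The function name split_by_key_count and the lines list are hypothetical, introduced only for this example. Like the answer, it assumes that lines sharing a first field are adjacent.

```python
from itertools import groupby


def split_by_key_count(lines):
    """Return (unique, repeated) lists of lines, keyed on the first field.

    Assumes lines that share a first field are adjacent (e.g. sorted input).
    """
    unique, repeated = [], []
    for _, group in groupby(lines, key=lambda line: line.partition(' ')[0]):
        group = list(group)
        # A group of size 1 means the key occurred exactly once
        (repeated if len(group) > 1 else unique).extend(group)
    return unique, repeated


# Sample data copied from the question
lines = [
    "Meni22 xxxx xxxx",
    "Meni32_2 xxxx xxxx",
    "Meni32_2 xxxx xxxx",
    "Meni45_1 xxxx xxxx",
    "Meni45_1 xxxx xxxx",
    "Meni45 xxxx xxxx",
]
list1, list2 = split_by_key_count(lines)
```

Both result lists keep the input order, so list2 here holds the Meni32_2 lines before the Meni45_1 lines.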
Answer 1 (score: 2)
You can use collections.Counter:
from collections import Counter
lis1 = []
lis2 = []
with open("abc") as f:
    # Count how often each first field occurs
    c = Counter(line.split()[0] for line in f)
# Note: the iteration order of c.items() depends on the Python version
for key, val in c.items():
    if val == 1:
        lis1.append(key)
    else:
        lis2.extend([key] * val)
print(lis1)
print(lis2)
Output:
['Meni45', 'Meni22']
['Meni32_2', 'Meni32_2', 'Meni45_1', 'Meni45_1']
Edit:
from collections import defaultdict
lis1 = []
lis2 = []
with open("abc") as f:
    dic = defaultdict(list)
    for line in f:
        spl = line.split()
        # Map each first field to the list of its remaining columns
        dic[spl[0]].append(spl[1:])
for key, val in dic.items():
    if len(val) == 1:
        lis1.append(key)
    else:
        lis2.append(key)
print(lis1)
print(lis2)
print(dic["Meni32_2"])  # access the columns related to any key from the dict
Output:
['Meni45', 'Meni22']
['Meni32_2', 'Meni45_1']
[['xxxx', 'xxxx'], ['xxxx', 'xxxx']]
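The Counter and defaultdict versions above collect only the keys. If the full lines are wanted, as in the question's expected output, one possible variant is a two-pass sketch: count the first fields, then route each line by its count. The function name split_lines_by_count and the sample data are hypothetical, and unlike groupby this does not assume the input is sorted.

```python
from collections import Counter


def split_lines_by_count(lines):
    """Return (unique, repeated) lists of whole lines, keyed on the first field."""
    # Drop blank lines so line.split()[0] is always defined
    lines = [line for line in lines if line.strip()]
    counts = Counter(line.split()[0] for line in lines)
    unique, repeated = [], []
    for line in lines:
        if counts[line.split()[0]] == 1:
            unique.append(line)
        else:
            repeated.append(line)
    return unique, repeated


# Deliberately unsorted sample input (made up for this example)
sample = [
    "Meni45_1 a b",
    "Meni22 a b",
    "Meni45_1 c d",
    "Meni45 a b",
]
lis1, lis2 = split_lines_by_count(sample)
```

The cost is holding all lines in memory for the second pass, which is usually acceptable for files of this kind.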
Answer 2 (score: 1)
Consider using difflib
import difflib
d = difflib.Differ()
with open('a.txt') as fa, open('b.txt') as fb:
    # Differ.compare expects two sequences of lines, not two joined strings
    diff = d.compare(fa.readlines(), fb.readlines())
print(''.join(diff))
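To show what Differ produces, here is a small sketch that compares two in-memory lists of lines instead of a.txt and b.txt. The lists lines_a and lines_b are made up for this example. Note that this reports differences between two inputs; it does not by itself split a single list into unique and repeated entries.

```python
import difflib

# Hypothetical inputs: each element is one line, including its newline
lines_a = ["Meni22 xxxx xxxx\n", "Meni45 xxxx xxxx\n"]
lines_b = ["Meni22 xxxx xxxx\n", "Meni32_2 xxxx xxxx\n"]

diff = list(difflib.Differ().compare(lines_a, lines_b))

# Each output line starts with a two-character code:
# "  " common to both, "- " only in the first, "+ " only in the second,
# "? " a hint line that appears in neither input
common = [line[2:] for line in diff if line.startswith("  ")]
removed = [line[2:] for line in diff if line.startswith("- ")]
added = [line[2:] for line in diff if line.startswith("+ ")]
```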