I have an input file:
1A Traes_1AS_6052071D9.1 99.01 101 99.0
1A Traes_1DS_6BA87D1DA.1 96.04 101 99.0
1A Traes_1BS_480915AD0.1 94.06 101 99.0
1B Traes_1AS_49D585BA6.2 99.01 101 72.0
1B Traes_1BS_47F027BBE.2 98.02 101 89.0
1B Traes_1DS_3F816B920.1 97.03 101 92.0
1C Traes_1AS_3451447E0.1 99.01 101 97.0
1C Traes_1BS_9F243CEA6.2 92.93 99 97.0
1C Traes_1DS_2A6443F45.1 89.90 99 97.0
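Within each group of rows sharing the same line[0], I need to take the rows with the highest value of line[4] and, among those, print the one with the highest value of line[2], so that my output file looks like this.
Required output: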
1A Traes_1AS_6052071D9.1 99.01 101 99.0
1B Traes_1DS_3F816B920.1 97.03 101 92.0
1C Traes_1AS_3451447E0.1 99.01 101 97.0
This is my attempt, but it only selects by the highest line[4]:
import csv
from itertools import groupby
from operator import itemgetter

with open('my_file','rb') as f1:
    with open('out_file', 'wb') as f2:
        reader = csv.reader(f1, delimiter='\t')
        writer1 = csv.writer(f2, delimiter='\t')
        for group, rows in groupby(reader, itemgetter(0)):
            seen = set()
            rows = sorted(rows, key=lambda r: float(r[4]))
            for row in rows:
                max(rows, key=lambda r: float(r[4]))
            writer1.writerow(row)
Answer 0 (score: 3)
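Have the key function passed to max return the tuple (r[4], r[2]). Tuples compare element by element, so rows are ranked by line[4] first and ties are broken by line[2].
Slightly simplified example (without the output file)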
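One way the tuple-key idea could look is sketched below. This is not the answer's original code. It assumes Python 3, tab-separated input, and rows already grouped by the first column (as in the question's sample). The names SAMPLE and best_rows are made up for illustration, and the values are converted with float so they compare numerically rather than as strings.

```python
import csv
import io
from itertools import groupby
from operator import itemgetter

# Sample rows from the question; tab-separated is an assumption.
SAMPLE = """\
1A\tTraes_1AS_6052071D9.1\t99.01\t101\t99.0
1A\tTraes_1DS_6BA87D1DA.1\t96.04\t101\t99.0
1A\tTraes_1BS_480915AD0.1\t94.06\t101\t99.0
1B\tTraes_1AS_49D585BA6.2\t99.01\t101\t72.0
1B\tTraes_1BS_47F027BBE.2\t98.02\t101\t89.0
1B\tTraes_1DS_3F816B920.1\t97.03\t101\t92.0
1C\tTraes_1AS_3451447E0.1\t99.01\t101\t97.0
1C\tTraes_1BS_9F243CEA6.2\t92.93\t99\t97.0
1C\tTraes_1DS_2A6443F45.1\t89.90\t99\t97.0
"""


def best_rows(lines):
    """Yield one row per run of equal column 0: the row with the highest
    column 4, with ties broken by the highest column 2."""
    reader = csv.reader(lines, delimiter='\t')
    for _, rows in groupby(reader, key=itemgetter(0)):
        # The key is a tuple, so column 4 is compared first, then column 2.
        yield max(rows, key=lambda r: (float(r[4]), float(r[2])))


for row in best_rows(io.StringIO(SAMPLE)):
    print('\t'.join(row))
```

Because groupby only merges adjacent rows, input that is not already grouped by the first column would need to be sorted on that column first.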