How to sort by two columns and print the row with the highest value

Time: 2015-05-12 09:52:57

Tags: python sorting

I have an input file:

1A  Traes_1AS_6052071D9.1   99.01   101 99.0    
1A  Traes_1DS_6BA87D1DA.1   96.04   101 99.0    
1A  Traes_1BS_480915AD0.1   94.06   101 99.0    
1B  Traes_1AS_49D585BA6.2   99.01   101 72.0    
1B  Traes_1BS_47F027BBE.2   98.02   101 89.0    
1B  Traes_1DS_3F816B920.1   97.03   101 92.0    
1C  Traes_1AS_3451447E0.1   99.01   101 97.0
1C  Traes_1BS_9F243CEA6.2   92.93   99  97.0    
1C  Traes_1DS_2A6443F45.1   89.90   99  97.0    

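I need to:

  1. Group the rows by line[0] and iterate within each group.
  2. Sort by line[4] from lowest to highest and take the highest value.
  3. If the line[4] values are tied, pick the row with the highest value in line[2].
  4. Print the result, so that my output file looks like this (required output):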

    1A  Traes_1AS_6052071D9.1   99.01   101 99.0    
    1B  Traes_1DS_3F816B920.1   97.03   101 92.0    
    1C  Traes_1AS_3451447E0.1   99.01   101 97.0    
    

Here is my attempt, but it only takes the highest line[4] into account:

    import csv
    from itertools import groupby
    from operator import itemgetter
    with open('my_file','rb') as f1:
        with open('out_file', 'wb') as f2:
            reader = csv.reader(f1, delimiter='\t')
            writer1 = csv.writer(f2, delimiter='\t')
            for group, rows in groupby(reader, itemgetter(0)):
                seen = set()
                rows = sorted(rows, key=lambda r: float(r[4]))
                for row in rows:
                    max(rows, key=lambda r: float(r[4]))
                    writer1.writerow(row)
    

1 answer:

Answer 0 (score: 3)

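Have the `key` function for `max` return the tuple `(r[4], r[2])`. Tuples compare element by element, so ties on `r[4]` are broken by `r[2]`.

A slightly simplified example (no output file):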
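A minimal sketch of that approach in Python 3, assuming the input is tab-separated as in the question and already ordered by the first column (`groupby` only merges consecutive rows). The helper name `best_rows` and the inline `SAMPLE` data are illustrative only, and the fields are converted with `float` here so that they compare numerically rather than as strings:

```python
import csv
import io
from itertools import groupby
from operator import itemgetter

# Sample rows from the question, assumed to be tab-separated.
SAMPLE = """\
1A\tTraes_1AS_6052071D9.1\t99.01\t101\t99.0
1A\tTraes_1DS_6BA87D1DA.1\t96.04\t101\t99.0
1A\tTraes_1BS_480915AD0.1\t94.06\t101\t99.0
1B\tTraes_1AS_49D585BA6.2\t99.01\t101\t72.0
1B\tTraes_1BS_47F027BBE.2\t98.02\t101\t89.0
1B\tTraes_1DS_3F816B920.1\t97.03\t101\t92.0
1C\tTraes_1AS_3451447E0.1\t99.01\t101\t97.0
1C\tTraes_1BS_9F243CEA6.2\t92.93\t99\t97.0
1C\tTraes_1DS_2A6443F45.1\t89.90\t99\t97.0
"""


def best_rows(lines):
    """Yield one row per run of equal column-0 values (hypothetical helper).

    The chosen row has the largest column 4; ties are broken by column 2.
    """
    reader = csv.reader(lines, delimiter="\t")
    rows = (row for row in reader if row)  # skip blank lines
    for _, group in groupby(rows, key=itemgetter(0)):
        # Tuples compare element by element: column 4 first, then column 2.
        yield max(group, key=lambda r: (float(r[4]), float(r[2])))


for row in best_rows(io.StringIO(SAMPLE)):
    print("\t".join(row))
# Selected IDs: Traes_1AS_6052071D9.1, Traes_1DS_3F816B920.1, Traes_1AS_3451447E0.1
```

To write the result to a file instead of printing it, each yielded row can be passed to `csv.writer(f, delimiter='\t').writerow(row)`; on Python 3 the files should be opened in text mode with `newline=''` rather than `'rb'`/`'wb'`.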