Question

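I am filtering a CSV file in which each title has many duplicate IDs with different prediction values, so column 2 (Pythonic, i.e. zero-based counting) differs between them. I want to keep only the 30 lowest values, but with unique IDs. I came up with the code below, but I don't know how to keep only the lowest 30 entries.

Could you suggest how to get 30 entries that are unique by ID?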

# title1    id1 100 7.78E-25 # example of the line

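import csv

# `ligands` (the titles to keep) is defined elsewhere in the script.
with open("test.txt") as fi:
    cmp = {}  # (title, id) -> row with the lowest value in column 3 so far
    for R in csv.reader(fi, delimiter='\t'):
        for L in ligands:
            newR = R[0], R[1]
            if R[0] == L:
                if int(R[2]) <= 1000 and int(R[2]) != 0 and float(R[3]) < 1.0e-10:
                    if newR in cmp:
                        # keep the row with the lower value for this key
                        if float(cmp[newR][3]) > float(R[3]):
                            cmp[newR] = R[:-2]
                    else:
                        cmp[newR] = R[:-2]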

Answer 1

Maybe try something along these lines...

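from bisect import insort

N = 30
# Start with N placeholders that are larger than any real value.
nth_lowest = [float("inf")] * N

for x in my_loop:  # my_loop: any iterable of numeric values
    # ... any other per-item processing goes here ...
    if x < nth_lowest[-1]:
        insort(nth_lowest, x)  # insert x, keeping the list sorted
        nth_lowest.pop()  # remove the highest element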
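
The snippet above tracks the lowest values but does not handle the unique-ID requirement from the question. One possible way to combine both, as a rough sketch rather than a definitive solution: first keep only the lowest-valued row per ID in a dict, then pick the N lowest of those with `heapq.nsmallest` (used here instead of `bisect.insort`). The function name `lowest_unique`, its parameters, and the sample rows are made up for illustration; the sketch assumes the ID is in column 1 and the value to minimize is in column 3, as in the example line of the question.

```python
import csv
import heapq
import io


def lowest_unique(rows, n=30, id_col=1, value_col=3):
    """Return the n rows with the lowest value, at most one row per ID."""
    best = {}  # ID -> row with the lowest value seen so far for that ID
    for row in rows:
        key = row[id_col]
        if key not in best or float(row[value_col]) < float(best[key][value_col]):
            best[key] = row
    # nsmallest returns the selected rows sorted by ascending value
    return heapq.nsmallest(n, best.values(), key=lambda r: float(r[value_col]))


# Hypothetical sample data in the tab-separated layout from the question.
sample = (
    "title1\tid1\t100\t7.78E-25\n"
    "title1\tid1\t120\t3.10E-12\n"
    "title1\tid2\t90\t1.00E-30\n"
    "title1\tid3\t80\t5.00E-15\n"
)
rows = csv.reader(io.StringIO(sample), delimiter="\t")
top = lowest_unique(rows, n=2)
for row in top:
    print(row)
```

If the 30 lowest entries are needed per title rather than overall, the same function could be applied to the rows of each title separately. The filters from the question (the column 2 range and the 1.0e-10 threshold) would go inside the loop before the dict update.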

Python: keep the N lowest results of csv.reader
