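I have looked through many similar threads, but, perhaps because of my limited knowledge of Python, I could not find a solution that works for my problem.
Here is part of the code: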
for line in splitline:
    if "Fam" in line:
        # Normal samples are tagged "NK" or "V"
        if "NK" in line or "V" in line:
            normaali = line.split()
            normaalilista.append(normaali)
            both.append(normaali)
        # Tumor samples are tagged "TK"
        if "TK" in line:
            tumor = line.split()
            tuumorilista.append(tumor)
            both.append(tumor)
The output of "both" currently looks like this:
['Fam_c828_1', '12-0799NK', '100']
['Fam_c828_1', '12-0800TK', '100']
['Fam_s56_1', '12-0801TK', '100']
['Fam_s134_1', '12-0802NK', '100']
['Fam_s146_1', '12-0803TK', '100']
I want to keep the rows that share the same value at index [0], as in this case:
['Fam_c828_1', '12-0799NK', '100']
['Fam_c828_1', '12-0800TK', '100']
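and move all the other rows to a separate list.
Thanks in advance.
Answer 0 (score: 1)
Group the lines by the value of their first whitespace-separated column: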
from collections import defaultdict
d = defaultdict(list)  # first column -> list of split rows
for line in splitline:
    columns = line.split()
    d[columns[0]].append(columns)
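To get the two lists the question asks for, the grouped rows can then be split by group size. This is only a minimal sketch: it assumes the rows are already split as in the question's `both` list, and the names `groups`, `kept` and `rest` are made up for illustration.

```python
from collections import defaultdict

# Sample rows copied from the question's output of `both`.
both = [
    ['Fam_c828_1', '12-0799NK', '100'],
    ['Fam_c828_1', '12-0800TK', '100'],
    ['Fam_s56_1', '12-0801TK', '100'],
    ['Fam_s134_1', '12-0802NK', '100'],
    ['Fam_s146_1', '12-0803TK', '100'],
]

groups = defaultdict(list)  # first column -> rows sharing that value
for row in both:
    groups[row[0]].append(row)

kept, rest = [], []  # hypothetical names, not from the question
for rows in groups.values():
    # A group with more than one row means the first column is shared.
    if len(rows) > 1:
        kept.extend(rows)
    else:
        rest.extend(rows)

print(kept)
print(rest)
```

With the sample data, `kept` should hold the two `Fam_c828_1` rows and `rest` the remaining three.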
Answer 1 (score: 1)
You can use itertools.groupby:
>>> from itertools import groupby
>>> groups = groupby(both, lambda x: x[0]) # Group `both` by the zeroth index of its members
>>> group = next(groups) # Get the first group in groups
>>> group
('Fam_c828_1', <itertools._grouper object at 0x10f065d10>)
>>> list(group[1]) # Cast the group iterable into a list for display purposes
[['Fam_c828_1', '12-0799NK', '100'], ['Fam_c828_1', '12-0800TK', '100']]
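Note that `groupby` only merges rows that are adjacent, so if matching rows might not sit next to each other, the input should be sorted by the same key first. The sketch below shows one way this could be done for the whole list. The rows are the ones from the question, deliberately reordered to make the point, and `kept` and `rest` are made-up names.

```python
from itertools import groupby
from operator import itemgetter

# Rows from the question, reordered so the matching pair is not adjacent.
both = [
    ['Fam_c828_1', '12-0799NK', '100'],
    ['Fam_s56_1', '12-0801TK', '100'],
    ['Fam_c828_1', '12-0800TK', '100'],
    ['Fam_s134_1', '12-0802NK', '100'],
    ['Fam_s146_1', '12-0803TK', '100'],
]

first_column = itemgetter(0)

kept, rest = [], []  # hypothetical names, not from the question
# sorted() is stable, so rows with equal keys keep their relative order.
for key, members in groupby(sorted(both, key=first_column), key=first_column):
    rows = list(members)  # materialize the group before inspecting its size
    if len(rows) > 1:
        kept.extend(rows)
    else:
        rest.extend(rows)

print(kept)
print(rest)
```

Sorting costs O(n log n); if the original order of `rest` matters, a dictionary-based grouping may be the simpler choice.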