I have a large text file like this.
Small example:
chr4 53382 53385 47 chr4 53382 53385 ZNF595 ENST00000509152.2 annotated
chr16 103500 103550 27 chr16 103475 103586 POLR3K ENST00000293860.5 annotated
chr16 103550 103586 43 chr16 103475 103586 POLR3K ENST00000293860.5 annotated
chr16 103584 103600 43 chr16 103584 104058 SNRNP25 ENST00000293861.3 annotated
chr16 103900 103950 37 chr16 103584 104058 SNRNP25 ENST00000293861.3 annotated
I want to group the rows by the 8th column and, for the rows in each group, sum the values in the 4th column.
I tried the following code:
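import pandas as pd
# The file has no header row, so the columns are numbered 0-9
df = pd.read_csv("myfile.txt", sep='\t', header=None)
# Group by the 8th column (index 7) and sum the 4th column (index 3)
out = df.groupby(7)[3].sum()
out.to_csv('outfile.txt', sep='\t', header=False)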
Expected output:
chr4 53382 53385 47 chr4 53382 53385 ZNF595 ENST00000509152.2 annotated
chr16 103550 103586 70 chr16 103475 103586 POLR3K ENST00000293860.5 annotated
chr16 103584 103600 80 chr16 103584 104058 SNRNP25 ENST00000293861.3 annotated
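
A plain groupby sum returns only the gene name and the total, not full rows like the ones above. Here is a minimal sketch of one way to keep whole rows, assuming the file is tab-separated with no header line. The helper name `sum_by_gene` is hypothetical, and keeping the first row of each group is an assumption: the expected output above shows the second POLR3K row but the first SNRNP25 row, so the rule for choosing the representative row is not clear from the example.

```python
import io
import pandas as pd

# Sample rows from the question, tab-separated with no header line.
sample = "\n".join([
    "chr4\t53382\t53385\t47\tchr4\t53382\t53385\tZNF595\tENST00000509152.2\tannotated",
    "chr16\t103500\t103550\t27\tchr16\t103475\t103586\tPOLR3K\tENST00000293860.5\tannotated",
    "chr16\t103550\t103586\t43\tchr16\t103475\t103586\tPOLR3K\tENST00000293860.5\tannotated",
    "chr16\t103584\t103600\t43\tchr16\t103584\t104058\tSNRNP25\tENST00000293861.3\tannotated",
    "chr16\t103900\t103950\t37\tchr16\t103584\t104058\tSNRNP25\tENST00000293861.3\tannotated",
])


def sum_by_gene(source):
    """Collapse rows sharing the 8th column (index 7), summing the 4th (index 3)."""
    df = pd.read_csv(source, sep="\t", header=None)
    # Write each group's total back onto every row of that group.
    df[3] = df.groupby(7)[3].transform("sum")
    # Assumption: keep the first row of each group, in file order.
    return df.drop_duplicates(subset=[7], keep="first").reset_index(drop=True)


result = sum_by_gene(io.StringIO(sample))
print(result.to_csv(sep="\t", header=False, index=False))
```

For the real file, `sum_by_gene("myfile.txt").to_csv("outfile.txt", sep="\t", header=False, index=False)` should write the collapsed rows. Passing `keep="last"` instead would keep the last row of each group.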