I have a very large .csv file that looks like this:
transcript_id,C3_MAR10,C4_APR10,CRL_2APR10,CRL_1_15JUL11,CRL_2_15JUL11,C1_OCT09,CRL_6OCT11,CRL_13DEC11,CRL_3DEC11,LRV6OCT11_A,LRV6OCT11_B
comp1000201_c0_seq1,0,5,0,0,0,0,0,0,0,0,0
comp1000297_c0_seq1,0,7,0,0,0,0,15,7,0,0,0
comp100036_c0_seq1,1,0,0,0,0,0,10,0,0,0,0
comp10003_c1_seq1,0,2,0,0,0,0,0,0,0,0,0
comp100041_c0_seq1,0,3,0,0,0,0,4,0,0,0,0
comp100041_c0_seq2,0,0,0,0,0,0,0,19,0,0,0
comp100041_c0_seq3,0,0,0,0,0,0,0,0,0,0,0
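I want to filter out (delete) every row in which all values are 0, without taking the first column into account. Of course, I still want the first column (the transcript IDs of the remaining rows) in my output file.
I tried using: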
sed '/[^0,]/!d' file.csv > filtered_file.csv
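But nothing gets filtered, because the first entry of every row is ≠ 0. I don't know how to say 'I only want to filter on columns 2 through 12'.
Any suggestions?
Thanks!
Answer 0: (score: 2)
Try this: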
awk -F',' 'NR>1{for (i=2;i<=NF;i++){sum +=$i}if (sum>0) print $0;sum=0}' csv
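The one-liner above sums columns 2 through NF and prints a row only when the sum is positive. As written, `NR>1` also means the header line is not printed. A minimal sketch of the same idea spread over several lines, with the header kept, might look like the following. The function name `filter_nonzero_rows` is hypothetical, and the `sum > 0` test assumes the counts are never negative.

```shell
# Hypothetical helper: same summing idea as the one-liner, header kept.
filter_nonzero_rows() {
    awk -F',' '
        NR == 1 { print; next }          # pass the header row through
        {
            sum = 0
            for (i = 2; i <= NF; i++)    # skip column 1 (the transcript ID)
                sum += $i
            if (sum > 0) print           # assumes counts are never negative
        }
    ' "$1"
}

# Usage: filter_nonzero_rows file.csv > filtered_file.csv
```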
Answer 1: (score: 1)
You can try something like this:
awk -F, 'NR>1{for(i=2;i<=NF;i++)if($i!=0){print $0;break}else continue;next}1' csv
Output:
$ awk -F, 'NR>1{for(i=2;i<=NF;i++)if($i!=0){print $0;break}else continue;next}1' csv
transcript_id,C3_MAR10,C4_APR10,CRL_2APR10,CRL_1_15JUL11,CRL_2_15JUL11,C1_OCT09,CRL_6OCT11,CRL_13DEC11,CRL_3DEC11,LRV6OCT11_A,LRV6OCT11_B
comp1000201_c0_seq1,0,5,0,0,0,0,0,0,0,0,0
comp1000297_c0_seq1,0,7,0,0,0,0,15,7,0,0,0
comp100036_c0_seq1,1,0,0,0,0,0,10,0,0,0,0
comp10003_c1_seq1,0,2,0,0,0,0,0,0,0,0,0
comp100041_c0_seq1,0,3,0,0,0,0,4,0,0,0,0
comp100041_c0_seq2,0,0,0,0,0,0,0,19,0,0,0
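In this command the trailing `1` prints the header (the only line that reaches it), while each data row is printed as soon as a non-zero field is found and is skipped otherwise. A more explicit sketch of the same early-exit logic could be written as below. The function name `keep_rows_with_nonzero` is hypothetical.

```shell
# Hypothetical helper: print a data row at the first non-zero field.
keep_rows_with_nonzero() {
    awk -F, '
        NR == 1 { print; next }              # header row
        {
            for (i = 2; i <= NF; i++)        # ignore column 1
                if ($i != 0) { print; next } # stop scanning once one is found
        }
    ' "$1"
}

# Usage: keep_rows_with_nonzero file.csv > filtered_file.csv
```

Stopping at the first non-zero field avoids scanning the rest of the row, which may help a little on a very large file.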
Answer 2: (score: 0)
Here is another approach:
awk '{f=$0;sub(/[^,]*/,"",f);gsub(/,/,"",f)} f~/[^0]/' file
transcript_id,C3_MAR10,C4_APR10,CRL_2APR10,CRL_1_15JUL11,CRL_2_15JUL11,C1_OCT09,CRL_6OCT11,CRL_13DEC11,CRL_3DEC11,LRV6OCT11_A,LRV6OCT11_B
comp1000201_c0_seq1,0,5,0,0,0,0,0,0,0,0,0
comp1000297_c0_seq1,0,7,0,0,0,0,15,7,0,0,0
comp100036_c0_seq1,1,0,0,0,0,0,10,0,0,0,0
comp10003_c1_seq1,0,2,0,0,0,0,0,0,0,0,0
comp100041_c0_seq1,0,3,0,0,0,0,4,0,0,0,0
comp100041_c0_seq2,0,0,0,0,0,0,0,19,0,0,0
After dropping the first column and the commas, it tests whether anything other than 0 remains.
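For completeness, the `sed` attempt from the question can be made to ignore the first column by anchoring the pattern: match a first field of any content followed only by `,0` fields, and delete those lines. This is a minimal sketch, assuming every zero is written as a bare `0` (not `0.0` or `00`), that the file uses Unix line endings, and that `sed` supports `-E` (GNU and BSD sed do). The function name `drop_all_zero_rows` is hypothetical.

```shell
# Hypothetical helper: delete rows whose fields after the first are all "0".
#   ^[^,]*   the first column (everything up to the first comma)
#   (,0)+$   one or more ",0" fields running to the end of the line
drop_all_zero_rows() {
    sed -E '/^[^,]*(,0)+$/d' "$1"
}

# Usage: drop_all_zero_rows file.csv > filtered_file.csv
```

The header survives because its column names do not match `(,0)+`.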