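How to get rows with values greater than or equal to 2 in at least 2 columns?

Asked: 2017-04-07 18:06:34

Tags: awk

I am trying to extract the rows that have a value >= 2 in at least two columns. My input file looks like this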

gain,top1,sos1,pho1
ATC1,0,0,0
ATC2,1,2,1
ATC3,6,6,0
ATC4,1,1,2

My awk script looks like this

cat input_file | awk 'BEGIN{FS=",";OFS=","};{count>=0;for(i=2; i<4; i++) {if($i!=0) {count++}};if (count>=2){print $0}}'

It does not give me the expected output

gain,top1,sos1,pho1
ATC3,6,6,0

What is wrong with this script? Thanks.

3 answers:

Answer 0: (score: 3)

awk -F, 'FNR>1{f=0; for(i=2; i<=NF; i++)if($i>=2)f++}f>=2 || FNR==1' file

Or the one below, which prints the row and moves on to the next line as soon as 2 matching values are found (reasonably faster)

awk -F, 'FNR>1{f=0; for(i=2; i<=NF; i++){ if($i>=2)f++; if(f>=2){ print; next} } }FNR==1' file

Explanation

awk -F, '                                # call awk and set the field separator to a comma
         FNR>1{                          # skip the header: run this block only when the record number within the current file is greater than 1
                 f=0;                    # reset the counter f for this row
                 for(i=2; i<=NF; i++)    # loop from the second field to the last field of the record/line/row
                 { 
                    if($i>=2)f++;        # if the field value is greater than or equal to 2, increment f
                    if(f>=2)             # once 2 such values have been found
                    { 
                       print;            # print the record/line/row
                       next              # that is enough, go to the next line
                    } 
                 } 
               }FNR==1                   # FNR==1 is true for the first record (the header), so the default action, print $0, runs
        ' file

Input

$ cat file
gain,top1,sos1,pho1
ATC1,0,0,0
ATC2,1,2,1
ATC3,6,6,0
ATC4,1,1,2

Output-1

$ awk -F, 'FNR>1{f=0; for(i=2; i<=NF; i++)if($i>=2)f++}f>=2 || FNR==1' file
gain,top1,sos1,pho1
ATC3,6,6,0

Output-2 (reasonably faster)

$ awk -F, 'FNR>1{f=0; for(i=2; i<=NF; i++){ if($i>=2)f++; if(f>=2){ print; next} } }FNR==1' file
gain,top1,sos1,pho1
ATC3,6,6,0
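
The counting loop in this answer can also be written with the threshold and the required number of matching columns passed in as variables. The following is only a sketch: the variable names `min`, `need` and `hits` and the temporary file are assumptions made for illustration, not part of the answer. Compared with the script in the question, it resets the counter for every row, loops through the last field as well, and tests `>= min` instead of `!= 0`.

```shell
#!/bin/sh
# Sample data copied from the question, written to a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
gain,top1,sos1,pho1
ATC1,0,0,0
ATC2,1,2,1
ATC3,6,6,0
ATC4,1,1,2
EOF

# min  = value a field must reach (assumed name)
# need = number of fields that must reach it (assumed name)
result=$(awk -F, -v min=2 -v need=2 '
    FNR == 1 { print; next }            # always keep the header line
    {
        hits = 0                        # reset the counter for every row
        for (i = 2; i <= NF; i++)       # fields 2..NF, including the last one
            if ($i + 0 >= min) hits++   # adding 0 forces a numeric comparison
    }
    hits >= need                        # default action: print the row
' "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

With `min=2` and `need=2` this should print the header and the `ATC3` row, the same as the one-liners above.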

Answer 1: (score: 0)

A hacky awk that also handles the header

$ awk -F, '($2>=2) + ($3>=2) + ($4>=2) > 1' file

gain,top1,sos1,pho1
ATC3,6,6,0

or, alternatively

$ awk -F, 'function ge2(x) {return x>=2?1:0}  
           ge2($2) + ge2($3) + ge2($4) > 1' file

gain,top1,sos1,pho1
ATC3,6,6,0
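
The header gets through this filter most likely because awk compares a non-numeric field such as `top1` with `2` as strings, and `"top1"` sorts after `"2"`. A small sketch of that behaviour, using a made-up two-field input line (an assumption for illustration only):

```shell
#!/bin/sh
# Field 1 is plain text, field 2 looks like a number.
result=$(printf 'top1,6\n' | awk -F, '{
    # $1 >= 2  : string comparison of the text top1 with 2
    # $2 >= 2  : numeric comparison, 6 >= 2
    # $2 >= 10 : numeric comparison, 6 >= 10
    printf "%d %d %d\n", ($1 >= 2), ($2 >= 2), ($2 >= 10)
}')
printf '%s\n' "$result"
```

The text field counts as "greater than or equal to 2", which is why each header column contributes 1 to the sum and the header line is printed without a separate `FNR==1` test.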

Answer 2: (score: 0)

@pali: try this; hopefully it is much faster.

awk '{Q=$0;}(gsub(/,[2-9]/,"",Q)>=2) || FNR==1'  Input_file

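Here I copy the line into a variable named Q, then globally replace every match of a comma followed by a digit from 2 to 9 in Q with an empty string. gsub returns the number of replacements it made, so the current line is printed if that count is greater than or equal to 2, or if the line number is 1.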
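
One caveat worth checking before relying on the regex version: it looks only at the first character after each comma, so a value such as 10 or 12 is not counted. The sketch below compares it with the counting loop from answer 0 on a hypothetical row, `ATC5,10,12,0`, which is not in the original data.

```shell
#!/bin/sh
# Hypothetical row: two values are >= 2, but both start with the digit 1.
row='ATC5,10,12,0'

# Regex approach from this answer.
regex_out=$(printf 'header\n%s\n' "$row" |
    awk '{Q=$0} (gsub(/,[2-9]/,"",Q)>=2) || FNR==1')

# Numeric counting loop from answer 0.
loop_out=$(printf 'header\n%s\n' "$row" |
    awk -F, 'FNR>1{f=0; for(i=2; i<=NF; i++)if($i>=2)f++}f>=2 || FNR==1')

printf 'regex:\n%s\n' "$regex_out"
printf 'loop:\n%s\n' "$loop_out"
```

The regex version prints only the header here, while the numeric loop also prints the `ATC5` row, so the regex shortcut is only safe when every value is a single digit.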