I am trying to extract the rows that have a value >= 2 in at least two columns. My input file looks like this:
gain,top1,sos1,pho1
ATC1,0,0,0
ATC2,1,2,1
ATC3,6,6,0
ATC4,1,1,2
My awk script looks like this:
cat input_file | awk 'BEGIN{FS=",";OFS=","};{count>=0;for(i=2; i<4; i++) {if($i!=0) {count++}};if (count>=2){print $0}}'
It does not give me the expected output:
gain,top1,sos1,pho1
ATC3,6,6,0
What is wrong with this script? Thanks.
Answer 0 (score: 3)
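Count the fields that are >= 2 in each data row and print the row if there are at least 2 of them; the header is always printed:
awk -F, 'FNR>1{f=0; for(i=2; i<=NF; i++)if($i>=2)f++}f>=2 || FNR==1' file
Or use the one below, which prints and moves on to the next line as soon as 2 such values are found (reasonably faster):
awk -F, 'FNR>1{f=0; for(i=2; i<=NF; i++){ if($i>=2)f++; if(f>=2){ print; next} } }FNR==1' file
Explanation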
awk -F, '                      # call awk with a comma as the field separator
FNR>1{                         # skip the header: only check records after the first one in the current file
    f=0;                       # reset the counter f for this record
    for(i=2; i<=NF; i++)       # loop from the second field to the last field of the record
    {
        if($i>=2)f++;          # if the field value is greater than or equal to 2, increment f
        if(f>=2)               # once 2 such values have been found
        {
            print;             # print the record
            next               # that is enough, so move on to the next line
        }
    }
}FNR==1                        # for the first record (the header) this pattern is true, so awk runs its default action, print $0
' file
Input
$ cat file
gain,top1,sos1,pho1
ATC1,0,0,0
ATC2,1,2,1
ATC3,6,6,0
ATC4,1,1,2
Output (the early-exit version prints the same lines)
$ awk -F, 'FNR>1{f=0; for(i=2; i<=NF; i++)if($i>=2)f++}f>=2 || FNR==1' file
gain,top1,sos1,pho1
ATC3,6,6,0
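As for what is wrong with the script in the question, three things stand out as far as I can tell: `count>=0` is a comparison rather than an assignment, so the counter is never reset between rows; the loop condition `i<4` stops before the fourth field; and `$i!=0` tests for non-zero values instead of values >= 2. The following is a minimal sketch of the same script with only those points changed. The variable names `sample` and `fixed` are my own, and the input is piped in rather than read from a file so the snippet is self-contained.

```shell
# Sample input copied from the question.
sample='gain,top1,sos1,pho1
ATC1,0,0,0
ATC2,1,2,1
ATC3,6,6,0
ATC4,1,1,2'

fixed=$(printf '%s\n' "$sample" | awk 'BEGIN { FS = "," }
NR == 1 { print; next }            # always keep the header line
{
    count = 0                      # an assignment, so the counter restarts for every row
    for (i = 2; i <= NF; i++)      # visit every value column, including the last one
        if ($i >= 2) count++       # test the threshold, not just "non-zero"
    if (count >= 2) print
}')

printf '%s\n' "$fixed"
```

With the sample data this should print the header followed by `ATC3,6,6,0`.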
Answer 1 (score: 0)
A hacky awk solution that also handles the header:
$ awk -F, '($2>=2) + ($3>=2) + ($4>=2) > 1' file
gain,top1,sos1,pho1
ATC3,6,6,0
or, with a helper function:
$ awk -F, 'function ge2(x) {return x>=2?1:0}
ge2($2) + ge2($3) + ge2($4) > 1' file
gain,top1,sos1,pho1
ATC3,6,6,0
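The header gets through these one-liners without any special case. My understanding is that awk compares a field numerically only when it looks like a number; otherwise it falls back to a string comparison, and a header such as `top1` sorts after `"2"`. The small sketch below illustrates that behaviour on made-up one-column input. The variable name `kept` is hypothetical, and `LC_ALL=C` is set only to make the string ordering predictable.

```shell
# One header-like value and three numbers go through the same ">= 2" test.
# "top1" is compared as a string, while 1, 2 and 10 are compared as numbers.
kept=$(printf 'top1\n1\n2\n10\n' | LC_ALL=C awk '$1 >= 2')

printf '%s\n' "$kept"
```

This should keep `top1`, `2` and `10` and drop `1`. Note that `10` survives because the comparison is numeric; a string comparison would place `"10"` before `"2"`.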
Answer 2 (score: 0)
@pali: try this; hopefully it should be much faster.
awk '{Q=$0;}(gsub(/,[2-9]/,"",Q)>=2) || FNR==1' Input_file
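Here I copy the current line into a variable named Q, then globally replace in Q every comma followed by a digit from 2 to 9 with an empty string. gsub returns the number of substitutions it made, so the current line is printed if that count is greater than or equal to 2, or if the line number is 1.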
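One caveat with the pattern `/,[2-9]/`: it only looks at the first digit after each comma, so a value such as 10 or 15 is not counted even though it is >= 2. The sketch below compares the original pattern with a hypothetical wider one of my own, `/,([2-9]|[1-9][0-9])/`, which assumes the values are non-negative numbers written without leading zeros. The row `ATC5,10,15,0` is made up for the demonstration and is not part of the question's data.

```shell
# Two rows from the question plus one invented row with two-digit values.
sample='gain,top1,sos1,pho1
ATC1,0,0,0
ATC3,6,6,0
ATC5,10,15,0'

# Original pattern: matches only values whose first digit is 2-9, so ATC5 is missed.
orig=$(printf '%s\n' "$sample" | awk '{Q=$0} (gsub(/,[2-9]/,"",Q)>=2) || FNR==1')

# Wider pattern (an assumption, not from the answer): also accepts 10 and above.
wide=$(printf '%s\n' "$sample" | awk '{Q=$0} (gsub(/,([2-9]|[1-9][0-9])/,"",Q)>=2) || FNR==1')

printf 'original pattern:\n%s\n\nwider pattern:\n%s\n' "$orig" "$wide"
```

If the data can contain negative numbers, leading zeros, or other formats, the field-by-field numeric comparison used in the other answers is the safer choice.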