I have a file like the following:
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] LOW H2L 0.867568 SETUP
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] HIGH H2L 2.3 SETUP
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] HIGH H2L 2.3 SETUP
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] HIGH H2L 2.3 SETUP
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] HIGH H2L 1.025 SETUP
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] HIGH H2L 2.3 SETUP
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] HIGH H2L 0.85125 HOLD
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] LOW H2L 0.850877 HOLD
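For each group of lines whose other fields (1-5 and 7) are identical, I want to print the line with the highest value in field 6.
Expected output: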
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] LOW H2L 0.867568 SETUP
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] HIGH H2L 2.3 SETUP
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] HIGH H2L 0.85125 HOLD
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] LOW H2L 0.850877 HOLD
Is there a clever way to do this in awk?
Answer 0 (score: 1)
The sensible approach is to use sort + awk:
$ sort -k6,6nr file | awk '!seen[$1,$2,$3,$4,$5,$7]++'
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] HIGH H2L 2.3 SETUP
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] LOW H2L 0.867568 SETUP
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] HIGH H2L 0.85125 HOLD
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] LOW H2L 0.850877 HOLD
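To make the mechanics of that pipeline concrete, here is a minimal sketch on a made-up three-line sample. The function name `max_per_key` and the sample rows are assumptions for illustration only; `LC_ALL=C` is added so that `sort -n` treats `.` as the decimal point regardless of locale.

```shell
# max_per_key: hypothetical wrapper around the sort + awk pipeline.
# sort puts the largest field-6 value first; awk then prints only the
# first line it sees for each combination of fields 1-5 and 7.
max_per_key() {
  LC_ALL=C sort -k6,6nr "$@" | awk '!seen[$1,$2,$3,$4,$5,$7]++'
}

# Made-up sample: the two SETUP rows share a key, the HOLD row is alone.
printf '%s\n' \
  'a b c LOW H2L 1.5 SETUP' \
  'a b c LOW H2L 2.5 SETUP' \
  'a b c HIGH H2L 0.5 HOLD' | max_per_key
```

Because the lines are sorted first, the output comes out in descending order of field 6 rather than in the original file order.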
But if you want to use awk alone, you can read the file twice:
$ awk '
{ orig=$0; $6=""; key=$0; $0=orig }
NR==FNR{ if ( !(key in max) || $6 > max[key] ) { max[key]=$6; nr[key]=NR } next }
nr[key]==FNR
' file file
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] LOW H2L 0.867568 SETUP
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] HIGH H2L 2.3 SETUP
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] HIGH H2L 0.85125 HOLD
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] LOW H2L 0.850877 HOLD
Answer 1 (score: 0)
If the field order does not need to match the desired output,
awk '{if(uniqueSet[$1" "$2" "$3" "$4" "$5" "$7] < $6) { uniqueSet[$1" "$2" "$3" "$4" "$5" "$7] = $6} }END{for(i in uniqueSet){print i" "uniqueSet[i]} }' <input_file_name>
gives:
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] LOW H2L SETUP 0.867568
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] LOW H2L HOLD 0.850877
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] HIGH H2L HOLD 0.85125
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] HIGH H2L SETUP 2.3
If you want to keep the original field order,
awk '{if(uniqueSet[$1" "$2" "$3" "$4" "$5" "$7] < $6) { uniqueSet[$1" "$2" "$3" "$4" "$5" "$7] = $6} }END{for(i in uniqueSet){ split(i, ar, " "); print ar[1]" "ar[2]" "ar[3]" "ar[4]" "ar[5]" "uniqueSet[i]" "ar[6]} }' <input_file_name>
gives:
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] LOW H2L 0.867568 SETUP
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] LOW H2L 0.850877 HOLD
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] HIGH H2L 0.85125 HOLD
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] HIGH H2L 2.3 SETUP
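Two caveats apply to the one-liners above: an unset array element compares as 0, so a group whose values are all negative never records a maximum, and `for (i in uniqueSet)` visits keys in an unspecified order. A possible single-pass variant that avoids both is sketched below; the function name `max_in_order` and the sample rows are assumptions for illustration, not part of the original answer.

```shell
# max_in_order: hypothetical helper. Keeps, per key (fields 1-5 and 7),
# the whole line with the largest field 6, and prints the groups in the
# order their keys first appeared in the input.
max_in_order() {
  awk '
    {
      key = $1 SUBSEP $2 SUBSEP $3 SUBSEP $4 SUBSEP $5 SUBSEP $7
      if (!(key in max)) {            # first time this key appears
        order[++n] = key              # remember first-seen order
        max[key] = $6 + 0             # force a numeric value
        line[key] = $0
      } else if ($6 + 0 > max[key]) { # strictly larger value found
        max[key] = $6 + 0
        line[key] = $0
      }
    }
    END { for (i = 1; i <= n; i++) print line[order[i]] }
  ' "$@"
}

# Made-up sample with negative values in the SETUP group.
printf '%s\n' \
  'a b c LOW H2L -3 SETUP' \
  'a b c LOW H2L -1 SETUP' \
  'a b c HIGH H2L 0.5 HOLD' \
  'a b c LOW H2L -2 SETUP' | max_in_order
```

Storing the whole line instead of rebuilding it from the key also keeps the original field order and spacing without a `split` in the `END` block.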
Answer 2 (score: 0)
In GNU awk:
$ gawk '{
    t = $6                                   # put $6 to temp
    $6 = "MARK"                              # replace it with a marker, use $0 as key
    if (!($0 in v) || t > v[$0]) {           # if $0 not in value hash or t > previous value
        a[$0] = NR                           # in a goes the record number for ordering
        v[$0] = t
    }
}
END {                                        # in the end
    PROCINFO["sorted_in"] = "@val_num_asc"   # traverse a in growing order of NRs stored
    for (i in a) {
        sub(/MARK/, v[i], i)                 # replace mark with value
        print i                              # and output
    }
}' file
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] LOW H2L 0.867568 SETUP
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] HIGH H2L 2.3 SETUP
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] HIGH H2L 0.85125 HOLD
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] LOW H2L 0.850877 HOLD
Answer 3 (score: 0)
A short alternative using the GNU datamash + cut tools:
datamash -Wf -g1,2,3,4,5,7 max 6 <file | cut -f1-7 --output-delimiter=' '
Output:
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] LOW H2L 0.867568 SETUP
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] HIGH H2L 2.3 SETUP
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] HIGH H2L 0.85125 HOLD
scale_check BANK0_F2_WRDAT_P0[0] MCLK[0] LOW H2L 0.850877 HOLD