Good morning!
I have a file.csv with 140 rows and 26 columns. I need to sort the rows by the value in column 23. Here is a sample:
Controller1,NA,ASHEBORO,ASH,B,,3674,4572,1814,3674,4572,1814,1859,#NAME?,0,124.45%,49.39%,19%,1,,"Big Risk, No Spare disk",45.04%,4.35%,12.63%,160,464,,,,,,0,1,1,1,0,410,65%,1.1,1.1,1.3,0.65,0.65,0.75,0.04,0.1,,,,,,,,,
Controller2,EU,FR,URG,D,,0,0,0,0,0,0,0,#NAME?,0,#DIV/0!,#DIV/0!,#DIV/0!,1,,#N/A,0.00%,0.00%,#DIV/0!,NO STATS,-1088,,,,,,#N/A,#N/A,#N/A,#N/A,0,#N/A,65%,1.1,1.1,1.3,0.65,0.65,0.75,0.04,0.1,,,,,,,,,
Controller3,EU,FR,URG,D,,0,0,0,0,0,0,0,#NAME?,0,#DIV/0!,#DIV/0!,#DIV/0!,1,,#N/A,0.00%,0.00%,#DIV/0!,NO STATS,-2159,,,,,,#N/A,#N/A,#N/A,#N/A,0,#N/A,65%,1.1,1.1,1.3,0.65,0.65,0.75,0.04,0.1,,,,,,,,,
Controller4,NA,STARR,STA,D,,4430,6440,3736,4430,6440,3736,693,#NAME?,0,145.38%,84.35%,18%,1,,No more Data disk,65.17%,19.18%,-2.18%,849,-96,,,,,,0,2,1,2,2,547,65%,1.1,1.1,1.3,0.65,0.65,0.75,0.04,0.1,,,,,,,,,
To select rows based on the value in column 23, I run the following:
awk -F "%*," '$23 > 4' myfile.csv
Result:
Controller1,NA,ASHEBORO,ASH,B,,3674,4572,1814,3674,4572,1814,1859,#NAME?,0,124.45%,49.39%,19%,1,,"Big Risk, No Spare disk",45.04%,4.35%,12.63%,160,464,,,,,,0,1,1,1,0,410,65%,1.1,1.1,1.3,0.65,0.65,0.75,0.04,0.1,,,,,,,,,
Controller4,NA,STARR,STA,D,,4430,6440,3736,4430,6440,3736,693,#NAME?,0,145.38%,84.35%,18%,1,,No more Data disk,65.17%,19.18%,-2.18%,849,-96,,,,,,0,2,1,2,2,547,65%,1.1,1.1,1.3,0.65,0.65,0.75,0.04,0.1,,,,,,,,,
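In my example I used a threshold of 4% on column 23; the goal is to retrieve all rows whose percentage in column 23 is significantly high. The problem is that I cannot rely on the 4% value, because it is only representative of the current table. So I need another way to retrieve the rows with a high value in column 23.
I have to sort the controllers in descending order by the percentage in column 23, and I would prefer to process the top 10% of the sorted rows, to make sure I get the controllers with a large percentage.
The goal is to be able to change that percentage depending on the number of rows in the table.
Do you have any suggestions?
Thanks! :)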
Answer 0 (score: 1)
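If you want to use standard tools, you will need to read the file twice. If you are willing to use perl, though, you can read all lines into memory, sort them by column 23 in descending order, and print the first 10%:
perl -e 'my @sorted = sort { (split /,/, $b)[22] <=> (split /,/, $a)[22] } <>; print @sorted[0..$#sorted * .10]' input-file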
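The "read the file twice" route with standard tools could look roughly like the sketch below. The function name top_percent_by_col23 is hypothetical, and the sketch assumes that splitting on every comma is acceptable, as the awk command in the question does (a quoted field containing a comma shifts the column count). Keeping at least one row is also an assumption, not something the question asks for.

```shell
#!/bin/sh
# Hypothetical helper: print the top PERCENT% of FILE, ranked by column 23.
# Assumption: fields are split on every comma, quoted or not.
top_percent_by_col23() {
    percent="$1"
    file="$2"
    # First read: count the rows.
    total=$(wc -l < "$file")
    # Integer arithmetic rounds down; keep at least one row (an assumption).
    keep=$(( total * percent / 100 ))
    [ "$keep" -lt 1 ] && keep=1
    # Second read: numeric descending sort on field 23, then keep the top rows.
    # sort -n reads the leading number, so "45.04%" counts as 45.04 and
    # non-numeric cells such as "#DIV/0!" count as 0.
    sort -t, -k23,23nr "$file" | head -n "$keep"
}
```

For example, top_percent_by_col23 10 file.csv would print the 14 highest-ranked rows of a 140-row file.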
Answer 1 (score: 0)
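I could have sworn this question was a duplicate, but so far I have not found a similar one.
Whether the file is sorted or not does not matter. You can extract the first NUMBER lines from any file with head -n NUMBER. There is no built-in way to specify that number as a percentage, but you can compute how many lines (NUMBER) correspond to PERCENT% of the file.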
percentualHead() {
    percent="$1"
    file="$2"
    linesTotal="$(wc -l < "$file")"
    (( lines = linesTotal * percent / 100 ))
    head -n "$lines" "$file"
}
Or shorter, but less readable:
percentualHead() {
    head -n "$(( $(wc -l < "$2") * $1 / 100 ))" "$2"
}
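Calling percentualHead 10 yourFile prints the first 10% of the lines of yourFile to standard output.
Note that percentualHead only works with regular files, because the file has to be read twice. It does not work with FIFOs, <(), or pipes.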
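If piped input is needed, one possible workaround is to buffer standard input in a temporary file first, so that it can still be read twice. The sketch below is only an illustration: percentualHeadStdin is a hypothetical name, and it assumes mktemp is available and that the input fits on disk.

```shell
#!/bin/sh
# Hypothetical variant of percentualHead that reads from standard input.
percentualHeadStdin() {
    percent="$1"
    tmp="$(mktemp)" || return 1
    # Buffer stdin in a temporary file so it can be read twice.
    cat > "$tmp"
    # Same integer arithmetic as percentualHead (rounds down).
    lines=$(( $(wc -l < "$tmp") * percent / 100 ))
    head -n "$lines" "$tmp"
    rm -f "$tmp"
}
```

This would allow a pipeline such as sort -t, -k23,23nr file.csv | percentualHeadStdin 10, which sorts by column 23 in descending order and then keeps the first 10% of the rows.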
Answer 2 (score: 0)
Here is one for GNU awk that gets the top fraction p of the rows from the file, but outputs them in their order of appearance:
$ awk -F, -v p=0.5 '               # keep the top 50% of records by $23
NR==FNR {                          # first pass over the file
    a[NR]=$23                      # store the percentages in a, keyed by NR
    next
}
FNR==1 {                           # second pass, at its first record
    n=asorti(a,a,"@val_num_desc")  # a[1..n] = NRs ordered by descending value
    for(i=1;i<=n*p;i++)            # take only the top fraction p
        b[a[i]]                    # record those NRs as keys of b
}
(FNR in b)                         # print the top rows, in original file order
' file file | cut -d, -f 23        # file is read twice; cut shows column 23 for the demo
45.04%
19.18%
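Where GNU awk is not available, a similar result (the top rows, still in their original order) can be approximated with sort and POSIX awk by computing the cut-off value from the data instead of hard-coding it, as the question did with 4. This is a sketch under assumptions: top_fraction_in_order is a hypothetical name, fields are split on every comma, and rows that tie with the cut-off value are all printed, so the output can exceed the requested fraction.

```shell
#!/bin/sh
# Hypothetical helper: print rows whose column 23 is at or above the value
# found at the p-th fraction of the descending ranking, in original order.
top_fraction_in_order() {
    p="$1"       # fraction between 0 and 1, like p in the gawk script
    file="$2"
    total=$(wc -l < "$file")
    # Number of rows to keep, rounded down, with a minimum of 1 (an assumption).
    keep=$(awk -v n="$total" -v p="$p" \
        'BEGIN { k = int(n * p); if (k < 1) k = 1; print k }')
    # Column 23 of the keep-th row after a descending numeric sort.
    threshold=$(cut -d, -f23 "$file" | sort -nr | sed -n "${keep}p")
    # Adding 0 makes awk compare the leading numbers, so "19.18%" is 19.18.
    awk -F, -v t="$threshold" '($23 + 0) >= (t + 0)' "$file"
}
```

Calling top_fraction_in_order 0.1 file.csv would then select roughly the top 10% of the rows without reordering them.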