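Column 3 holds an hour value. I want to print a header covering hours 0 to 23 and, below it, the number of times each hour appears in column 3. If an hour does not appear, print 0.
Input file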
123 3 3
122 3 3
122 4 4
122 3 4
122 4 4
122 5 5
122 3 12
122 4 15
122 5 20
122 5 20
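Desired output
First line = header, hours 0 to 23, separated by commas.
Second line = the count found for each hour; print 0 if the hour is not found.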
0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23
0,0,0,2,3,1,0,0,0,0,0,0,1,0,0,1,0,0,0,0,2,0,0,0
To count by hour, I tried:
awk '{a[$3]++} END {for(i in a) print i, a[i]}'
Thanks.
Answer 0: (score: 3)
$ awk '
{ a[$3]++ }                                 # hash them
END {
    for(i=0;i<=23;i++) {                    # loop the hours
        b=b (b==""?"":",") i                # collect hours to b
        c=c (c==""?"":",") (a[i]?a[i]:0)    # and counts to c
    }
    print b ORS c                           # output them
}' file
0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23
0,0,0,2,3,1,0,0,0,0,0,0,1,0,0,1,0,0,0,0,2,0,0,0
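One case the sample data does not cover is zero-padded hours such as 03. The sketch below assumes such values may occur and normalizes them with $3 + 0 so that "03" and "3" land in the same bucket. The function name count_hours and the sample rows are made up for illustration and are not part of the original answer.

```shell
# Hypothetical helper: reads whitespace-separated rows on stdin and prints
# a 0..23 header line followed by the per-hour counts.
count_hours() {
  awk '
    NF >= 3 { a[$3 + 0]++ }        # $3 + 0 makes "03" and "3" share one key
    END {
      for (i = 0; i <= 23; i++) {
        hdr = hdr sep i            # header: 0,1,...,23
        cnt = cnt sep (a[i] + 0)   # count, or 0 when the hour is absent
        sep = ","                  # empty before the first item only
      }
      print hdr ORS cnt
    }'
}

# Made-up rows: hour 3 appears twice (once zero-padded), hour 20 once.
printf '122 3 03\n122 4 3\n122 5 20\n' | count_hours
```

The NF >= 3 guard skips blank or short lines, which would otherwise be counted as hour 0 once $3 is coerced to a number.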
Answer 1: (score: 3)
Another awk:
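$ awk '{a[$3]++}
END {
    while(i<24) {
        h1=h1 s i+0         # header; i+0 turns the unset i into 0
        h2=h2 s a[i++]+0    # count; +0 turns a missing hour into 0
        s=","               # separator is empty only before the first item
    }
    print h1 ORS h2
}' file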
0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23
0,0,0,2,3,1,0,0,0,0,0,0,1,0,0,1,0,0,0,0,2,0,0,0
P.S. This looks like a variant of @JamesBrown's answer.
Answer 2: (score: 2)
Could you please try the following:
awk '
BEGIN{
    OFS=","
    for(i=0;i<=23;i++){
        printf("%d%s",i,i==23?ORS:OFS)
    }
}
{
    a[$3]++
}
END{
    for(j=0;j<=23;j++){
        printf("%d%s",a[j],j==23?ORS:OFS)
    }
}' Input_file
The output is as follows.
0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23
0,0,0,2,3,1,0,0,0,0,0,0,1,0,0,1,0,0,0,0,2,0,0,0
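If the column number or the upper bound ever needs to change, the same idea can be parameterized with awk -v variables. This is only a sketch: the wrapper name count_range and the variables col and max are hypothetical and do not come from the answer above.

```shell
# Hypothetical wrapper: count_range COL MAX counts the values 0..MAX found
# in column COL of the rows read from stdin.
count_range() {
  awk -v col="$1" -v max="$2" '
    { a[$col]++ }                  # $col is the field selected by -v col
    END {
      for (i = 0; i <= max; i++) {
        hdr = hdr sep i
        cnt = cnt sep (a[i] + 0)   # +0 prints 0 for values never seen
        sep = ","
      }
      print hdr ORS cnt
    }'
}

# Count column 3 over the range 0..5 for three sample rows.
printf '123 3 3\n122 3 3\n122 4 4\n' | count_range 3 5
```

For the question as asked, the call would be count_range 3 23 with the input file on stdin.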
Answer 3: (score: 2)
Change the for loop slightly, from:
for(i in a) print i, a[i]
to:
for(i=0; i<=23; i++) print i, a[i]+0
Pipe the output to rs:
awk ... | rs -c' ' -T
Output:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
0 0 0 2 3 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 2 0 0 0
If you want CSV output, add tr at the end:
awk '{a[$3]++} END {for(i=0;i<=23;i++) print i, a[i]+0}' | rs -c' ' -T | tr -s ' ' ,
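The rs utility is not installed on every system. As a rough alternative, the transpose step can be done with a second awk that collects the two columns into two comma-separated rows. The helper name to_csv_rows is made up for this sketch, and the sample rows are illustrative.

```shell
# Hypothetical helper: turns "hour count" lines on stdin into two CSV rows,
# the first holding the hours and the second holding the counts.
to_csv_rows() {
  awk '{ h = h sep $1; c = c sep $2; sep = "," } END { print h ORS c }'
}

# Same counting step as above, fed from made-up rows instead of a file.
printf '122 3 3\n122 4 3\n122 5 20\n' |
  awk '{a[$3]++} END {for(i=0;i<=23;i++) print i, a[i]+0}' |
  to_csv_rows
```

This avoids both rs and tr, at the cost of a second awk process.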