Suppose I have a file like this:
13.03.2013 12:13:01 | STRING1 | NUMBER1 | 1 | NUMBER3
13.03.2013 12:13:08 | STRING1 | NUMBER1 | 12 | NUMBER3
13.03.2013 12:13:09 | STRING3 | NUMBER1 | 13 | NUMBER3
13.03.2013 12:13:12 | STRING2 | NUMBER1 | 21 | NUMBER3
13.03.2013 12:13:15 | STRING2 | NUMBER1 | 11 | NUMBER3
13.03.2013 12:13:18 | STRING1 | NUMBER1 | 13 | NUMBER3
13.03.2013 12:13:20 | STRING2 | NUMBER1 | 21 | NUMBER3
13.03.2013 12:13:25 | STRING3 | NUMBER1 | 51 | NUMBER3
13.03.2013 12:13:38 | STRING2 | NUMBER1 | 71 | NUMBER3
13.03.2013 12:13:40 | STRING1 | NUMBER1 | 21 | NUMBER3
13.03.2013 12:13:42 | STRING1 | NUMBER1 | 11 | NUMBER3
13.03.2013 12:13:55 | STRING3 | NUMBER1 | 71 | NUMBER3
13.03.2013 12:14:02 | STRING1 | NUMBER1 | 11 | NUMBER3
13.03.2013 12:14:07 | STRING1 | NUMBER1 | 13 | NUMBER3
13.03.2013 12:14:08 | STRING3 | NUMBER1 | 13 | NUMBER3
13.03.2013 12:14:15 | STRING2 | NUMBER1 | 21 | NUMBER3
13.03.2013 12:14:16 | STRING2 | NUMBER1 | 11 | NUMBER3
13.03.2013 12:14:16 | STRING1 | NUMBER1 | 1 | NUMBER3
13.03.2013 12:14:20 | STRING2 | NUMBER1 | 21 | NUMBER3
13.03.2013 12:14:25 | STRING3 | NUMBER1 | 51 | NUMBER3
13.03.2013 12:14:37 | STRING2 | NUMBER1 | 71 | NUMBER3
13.03.2013 12:14:42 | STRING1 | NUMBER1 | 1 | NUMBER3
13.03.2013 12:14:45 | STRING1 | NUMBER1 | 11 | NUMBER3
13.03.2013 12:14:58 | STRING3 | NUMBER1 | 51 | NUMBER3
13.03.2013 12:15:06 | STRING2 | NUMBER1 | 11 | NUMBER3
13.03.2013 12:15:13 | STRING1 | NUMBER1 | 43 | NUMBER3
13.03.2013 12:15:22 | STRING2 | NUMBER1 | 21 | NUMBER3
13.03.2013 12:15:26 | STRING3 | NUMBER1 | 51 | NUMBER3
13.03.2013 12:15:35 | STRING2 | NUMBER1 | 71 | NUMBER3
13.03.2013 12:15:40 | STRING1 | NUMBER1 | 1 | NUMBER3
13.03.2013 12:15:42 | STRING1 | NUMBER1 | 21 | NUMBER3
13.03.2013 12:15:53 | STRING3 | NUMBER1 | 71 | NUMBER3
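I want to find the per-minute average of the fourth column (the value after the third |), but only for rows that match a variable $X. For example, if $X="STRING1", the result should be: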
13.03.2013 12:13 | STRING1 | 11.6
13.03.2013 12:14 | STRING1 | 7.4
13.03.2013 12:15 | STRING1 | 21.666
So, for each minute, we look only at the rows for $X and compute the average of their fourth-column values. How can I do this?
Answer 0 (score: 2)
You can use the following awk program:
example.awk:
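# SEARCH is used as a regex against the whole line, so STRING1 also matches STRING10
$0 ~ SEARCH {
    split($1, time, ":")          # time[1] = "DD.MM.YYYY HH", time[2] = "MM"
    min = time[1] ":" time[2]     # key on date, hour and minute so different hours do not collide
    total[min] += $4
    count[min]++
}
END {
    # for (m in total) visits keys in an unspecified order; pipe through sort if order matters
    for (m in total) {
        printf "%s|%s|%s\n", m, SEARCH, total[m] / count[m]
    }
}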
Run it:
awk -F'|' -v SEARCH=STRING1 -f example.awk your.log
Output:
13.03.2013 12:13|STRING1|11.6
13.03.2013 12:14|STRING1|7.4
13.03.2013 12:15|STRING1|21.6667
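The program above matches SEARCH anywhere on the line and prints the minutes in whatever order `for (m in total)` happens to visit them. A minimal sketch of a variant that compares the second field exactly and sorts the result follows. The helper name `avg_per_minute` is hypothetical, and the plain `sort` is only chronological as long as all rows fall on the same day, since DD.MM.YYYY dates do not sort lexicographically across months or years.

```shell
# avg_per_minute X FILE: hypothetical helper, not part of the original answer.
avg_per_minute() {
    awk -F'|' -v X="$1" '
        {
            key = $2
            gsub(/^ +| +$/, "", key)        # trim the spaces around the second field
        }
        key == X {
            minute = substr($1, 1, 16)      # "DD.MM.YYYY HH:MM"
            total[minute] += $4             # " 12 " converts to the number 12
            count[minute]++
        }
        END {
            for (m in total)
                printf "%s | %s | %s\n", m, X, total[m] / count[m]
        }
    ' "$2" | sort                           # assumption: all rows share one date
}
```

Calling it as `avg_per_minute STRING1 your.log` on the sample file from the question should give the same three averages (11.6, 7.4, 21.6667), in time order and in the " | " layout the question asks for.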
Answer 1 (score: 2)
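awk -v X="STRING1" '
    BEGIN { FS = " *[|] *"; OFS = " | " }   # split on "|" and drop the surrounding spaces
    $2 != X { next }                        # keep only rows whose second field is exactly X
    { min = substr($1, 1, 16) }             # "DD.MM.YYYY HH:MM"
    min != prev {
        if (n > 0) print prev, X, total/n   # flush the previous minute, if there was one
        total = n = 0
        prev = min
    }
    { n++; total += $4 }
    END { if (n > 0) print prev, X, total/n }
' file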
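The one-pass approach above relies on the input already being in time order, and it prints each average with awk's default number formatting (for example 21.6667). If a fixed number of decimals is wanted, a sketch along the following lines could work. The helper name `avg_fixed` and its optional third argument are hypothetical. Note that `printf` rounds, so 65/3 is printed as 21.667 rather than the truncated 21.666 shown in the question.

```shell
# avg_fixed X FILE [DECIMALS]: hypothetical helper; assumes FILE is already in time order.
avg_fixed() {
    awk -v X="$1" -v D="${3:-3}" '
        # Print the average for the minute collected so far, if any rows matched.
        function emit() { if (n > 0) printf fmt, prev, X, total / n }
        BEGIN {
            FS  = " *[|] *"                    # split on "|" and drop surrounding spaces
            fmt = "%s | %s | %." D "f\n"       # e.g. "%s | %s | %.3f\n" when D is 3
        }
        $2 != X { next }                       # exact match on the second field
        { minute = substr($1, 1, 16) }         # "DD.MM.YYYY HH:MM"
        minute != prev { emit(); total = n = 0; prev = minute }
        { n++; total += $4 }
        END { emit() }
    ' "$2"
}
```

For example, `avg_fixed STRING1 file 2` would print the averages with two decimals instead of three.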