Question

I have a data file like this:

a   1
b   2
c   3 
d   4
a   5
b   6
c   7
d   6
etc

I want to write the following to a new file:

a   average of 2nd column from all "a" rows
b   average of 2nd column from all "b" rows
etc

where a, b, c, ... are also numbers. I have been able to do this with awk for one specific value of column 1 (1.4 in the example below):

awk '{  if ( $1 == 1.4) total += $2; count++ }
END {print total/10 }'  data

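However, count does not give me the correct number of rows (it should be 10 here, which is why I hard-coded 10 to compute the average on the last line).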

I assume a for loop is needed, but I can't get it to work correctly. Please help. Thanks.

Answer 1

awk '{a[$1]+=$2;c[$1]++}END{for(x in a)printf "average of %s is %.2f\n",x,a[x]/c[x]}'

With your sample input, the one-liner above outputs:

average of a is 3.00
average of b is 4.00
average of c is 5.00
average of d is 5.00
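
As for why count was wrong in the original attempt: in `if ( $1 == 1.4) total += $2; count++`, the `if` governs only the first statement, so `count++` runs on every line of the file. The sketch below shows one way to fix that, next to a per-key version. The function names `avg_for_key` and `avg_by_key` are hypothetical, chosen only for illustration. Piping through `sort` is an addition as well, since awk does not guarantee the order in which `for (k in sum)` visits the keys.

```shell
# avg_for_key KEY FILE (hypothetical helper name)
# Average of column 2 over the rows whose column 1 equals KEY.
avg_for_key() {
    awk -v key="$1" '
        # Both statements sit inside one action block, so count
        # only increases on rows that actually match.
        $1 == key { total += $2; count++ }
        END { if (count > 0) printf "%.2f\n", total / count }
    ' "$2"
}

# avg_by_key FILE (hypothetical helper name)
# Average of column 2 for every distinct value of column 1.
avg_by_key() {
    awk '
        # Skip blank or one-column lines; keep a sum and a row count per key.
        NF >= 2 { sum[$1] += $2; n[$1]++ }
        END { for (k in sum) printf "%s %.2f\n", k, sum[k] / n[k] }
    ' "$1" | sort   # sort because the for-in order is unspecified
}
```

With the file from the question, this would be called as `avg_for_key 1.4 data` for a single value, or `avg_by_key data > newfile` to write all averages to a new file.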