I want to count how many times each word in column 3 occurs. Here is the input:
IN A three
US B one
LK C two
US B three
US A one
IN A one
US B three
LK C three
US B two
US A two
IN A two
US B two
The output should look like this:
IN A three 4
US B one 3
LK C two 5
US B three 4
US A one 3
IN A one 3
US B three 4
LK C three 4
US B two 5
US A two 5
IN A two 5
US B two 5
Answer 0 (score: 5)
Here is one way:
$ awk 'FNR==NR{++a[$3]; next} {print $0, a[$3]}' file file
IN A three 4
US B one 3
LK C two 5
US B three 4
US A one 3
IN A one 3
US B three 4
LK C three 4
US B two 5
US A two 5
IN A two 5
US B two 5
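It reads the file twice: the first pass collects the counts, the second prints each line with its count.
FNR==NR{++a[$3]; next}
On the first pass (FNR==NR holds only while awk is reading the first file argument), count how many times each value in column 3 appears, then skip to the next line.
{print $0, a[$3]}
On the second pass, print the line followed by its count.
For tidier output, you can also use printf to put a tab between the line and the count:
{printf "%s\t%s\n", $0, a[$3]}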
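To try the two-pass approach end to end, here is a minimal sketch that wraps the printf variant in a shell function and runs it on a small sample. The function name count_col3 and the temporary file are assumptions made for illustration; they are not part of the original answer.

```shell
#!/bin/sh
# count_col3 is a hypothetical helper name; it passes the same file to awk
# twice so that the first pass can count and the second pass can print.
count_col3() {
    # Pass 1 (FNR==NR): tally each value seen in column 3.
    # Pass 2: print the line, a tab, and the tally for its column 3 value.
    awk 'FNR==NR { ++a[$3]; next } { printf "%s\t%s\n", $0, a[$3] }' "$1" "$1"
}

# Sample data: the first four input lines from the question.
tmp=$(mktemp)
printf '%s\n' 'IN A three' 'US B one' 'LK C two' 'US B three' > "$tmp"
count_col3 "$tmp"
rm -f "$tmp"
```

Because the file is named twice on the command line, this only works with a regular file that can be reread, not with data piped in on standard input.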