Get counts by value in bash

Time: 2018-06-09 10:13:43

Tags: json bash shell awk jq

I have data in a file in this format:

{"field1":249449,"field2":116895,"field3":1,"field4":"apple","field5":42,"field6":"2019-07-01T00:00:10","metadata":"","frontend":""}
{"field1":249448,"field2":116895,"field3":1,"field4":"apple","field5":42,"field6":"2019-07-01T00:00:10","metadata":"","frontend":""}
{"field1":249447,"field2":116895,"field3":1,"field4":"apple","field5":42,"field6":"2019-07-01T00:00:10","metadata":"","frontend":""}
{"field1":249443,"field2":116895,"field3":1,"field4":"apple","field5":42,"field6":"2019-07-01T00:00:10","metadata":"","frontend":""}
{"field1":249449,"field2":116895,"field3":1,"field4":"apple","field5":42,"field6":"2019-07-01T00:00:10","metadata":"","frontend":""}

Here, each entry is one line. I want to count the lines grouped by the value of field1, for example:

249449 : 2
249448 : 1
249447 : 1
249443 : 1

How can I get this?

4 answers:

Answer 0 (score: 3)

Using awk:

$ awk -F'[,:]' -v OFS=' : ' '{a[$2]++} END{for(k in a) print k, a[k]}' file
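Because `for (k in a)` visits keys in an unspecified order, the output order of the one-liner above can differ between awk implementations. Below is a minimal sketch that wraps the same idea in a function and sorts by value, descending, to match the order shown in the question. The name `count_by_field1` is made up for illustration, and the sketch assumes (as the answer does) that `field1` is always the first key on the line and that its value is a bare number.

```shell
#!/usr/bin/env bash
# count_by_field1 is a hypothetical helper name, not part of the original answer.
# Assumes "field1" is the first key on every line and its value is a bare number.
count_by_field1() {
    awk -F'[,:]' -v OFS=' : ' '
        { count[$2]++ }                              # $2 is the value of the first key
        END { for (k in count) print k, count[k] }   # iteration order is unspecified
    ' "$@" | sort -t' ' -k1,1nr                      # numeric sort on the value, descending
}
```

Usage: `count_by_field1 file`. With no file argument it reads from standard input.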

Answer 1 (score: 2)

You can use the jq command-line tool to parse the JSON data; uniq -c then counts the occurrences of each value.

% jq .field1 < $INPUTFILE | sort | uniq -c
      1 249443
      1 249447
      1 249448
      2 249449

(Tested with jq 1.5-1-a5b5cbe on Linux Xubuntu 18.04, using zsh.)
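The `uniq -c` output puts the count first and pads it with spaces. If the `value : count` layout from the question is wanted, one possible extra stage is sketched below. `to_value_count` is a name invented here, and the descending sort is an assumption based on the sample output in the question.

```shell
#!/usr/bin/env bash
# to_value_count is a hypothetical helper: it reads `uniq -c` output on stdin
# (for example "      2 249449") and prints "249449 : 2", highest value first.
to_value_count() {
    awk '{ print $2 " : " $1 }' | sort -k1,1nr
}

# Intended use (requires jq; INPUTFILE is the variable from the answer above):
#   jq .field1 < "$INPUTFILE" | sort | uniq -c | to_value_count
```

The helper itself only needs awk and sort, so it can be tried on any saved `uniq -c` output.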

Answer 2 (score: 0)

Here is an efficient jq-only solution:

reduce inputs.field1 as $x ({}; .[$x|tostring] += 1)
| to_entries[]
| "\(.key) : \(.value)"

Invocation: jq -nrf program.jq input.json

(Note the -n option in particular.)

Of course, if an object representation of the counts is satisfactory, one could simply write:

jq -n 'reduce inputs.field1 as $x ({}; .[$x|tostring] += 1)' input.json
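On why the -n option matters: without it, jq consumes the first JSON value as the program's input `.`, so `inputs` only yields the remaining values and the first line goes uncounted. A small sketch for comparing the two, assuming jq 1.5 or later is installed; the function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for comparing the two invocations.
# Both read JSON lines on stdin and print the counts as a compact object.
count_with_n() {
    jq -nc 'reduce inputs.field1 as $x ({}; .[$x|tostring] += 1)'
}

count_without_n() {
    # Without -n the first object becomes `.`, so `inputs` starts at the second one.
    jq -c 'reduce inputs.field1 as $x ({}; .[$x|tostring] += 1)'
}
```

Feeding both functions the same two lines with `"field1":7` gives `{"7":2}` from the first and `{"7":1}` from the second.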

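Answer 3 (score: 0)

Using datamash and some shell utils: change the non-data delimiters to squeezed tabs, count field 3 (it is really field 2, but there is a leading tab), reverse the order, then format the output per the OP's spec: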

tr -s '{":,}' '\t' < file | datamash -sg 3 count 3 | tac | xargs printf '%s : %s\n'

Output:

249449 : 2
249448 : 1
249447 : 1
249443 : 1
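To see why the value ends up in field 3, the `tr` stage of the pipeline above can be inspected on its own. This is only a sketch: `squeeze_delims` is a name invented here, and it relies on `tr` padding the shorter second set with its last character, which GNU and BSD `tr` both do.

```shell
#!/usr/bin/env bash
# squeeze_delims is a hypothetical name for the first pipeline stage from the answer.
# It maps every JSON punctuation character to a tab and squeezes runs of tabs into one.
squeeze_delims() {
    tr -s '{":,}' '\t'
}

# Each line starts with a delimiter, so field 1 is empty, field 2 is the key
# name and field 3 is the value that datamash groups on.
# Usage: squeeze_delims < file | cut -f2,3
```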