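How to read a JSON log file from a shell script (.sh) and get analytics data by counting occurrences in the log

Time: 2016-08-02 11:29:45

Tags: bash shell jq

I need to read data from a JSON log file and derive analytics data from it using a shell script.

The log file contains JSON like the following: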

{
"info": "label1",
"description": "some desc",
"timestamp": "2016-07-27T06:24:50.335Z"
}
{
"info": "label2",
"description": "some desc",
"timestamp": "2016-07-27T06:24:50.335Z"
}
{
"info": "label2",
"description": "some desc",
"timestamp": "2016-07-27T06:24:50.335Z"
}
{
"info": "label2",
"description": "some desc",
"timestamp": "2016-07-29T06:24:50.335Z"
}
{
"info": "label3",
"description": "some desc",
"timestamp": "2016-07-29T06:24:50.335Z"
}

I need the following result (using a shell script):

Labels   Date         Count

label1   2016-07-27   1
label2   2016-07-27   2
label2   2016-07-29   1
label3   2016-07-29   1

This is as far as I could get; I need some suggestions on how to approach the rest.

#!/bin/bash
my_dir=$(dirname "$0")
file="out.log"
#keysFile="$my_dir/keys.txt"
# -c prints each object on a single line, so the loop sees one record per iteration
jq -c '{id: .info, time: .timestamp}' "$file" | while IFS= read -r log; do
#This is as far as I could get. I was able to read the data in the form of {id: 'label1', time: '2016-07-27T06:24:50.335Z' }
#Now I need to somehow create a key value thing in shell and store timestamp / label as key and increment the count
echo "$log"
done

2 Answers:

Answer 0 (score: 0)

Given your input, you can turn it into CSV data with the following command:

$ jq -rs '
def to_csv($headers):
    def _object_to_csv:
        ($headers | @csv),
        (.[] | [.[$headers[]]] | @csv);
    def _array_to_csv:
        ($headers | @csv),
        (.[][:$headers|length] | @csv);
    if .[0]|type == "object" then
        _object_to_csv
    else
        _array_to_csv
    end;
map({ label: .info, date: .timestamp[:10] })
    | group_by(.)
    | map(.[0] + { count: length })
    | to_csv(["label", "date", "count"])
' input.json

This produces:

"label","date","count"
"label1","2016-07-27",1
"label2","2016-07-27",2
"label2","2016-07-29",1
"label3","2016-07-29",1

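If the whitespace-aligned layout from the question is preferred over CSV, the output above could be post-processed in the shell. The following is only a sketch: csv_to_table is a hypothetical helper name, the column widths are arbitrary, and it assumes no field contains an embedded comma or double quote.

```shell
#!/bin/sh
# Sketch: convert simple CSV lines on stdin into aligned columns.
# Assumption: fields never contain commas or double quotes themselves.
csv_to_table() {
    # Drop the double quotes that @csv puts around strings,
    # then split on commas and pad the first two columns.
    tr -d '"' | awk -F',' '{ printf "%-8s %-12s %s\n", $1, $2, $3 }'
}
```

It would be used by appending "| csv_to_table" to the jq command above.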
Answer 1 (score: 0)

Here is an approach that uses reduce instead of group_by.

Assuming your data is in out.log and the following filter is in filter.jq:

["Labels", "Date", "Count"], 
["", "", ""], 
(
  reduce .[] as $r (
      {}
    ; [$r.info, $r.timestamp[0:10]] as $p
    | setpath($p;getpath($p)+1)
  )
  | tostream
  | select(length==2)
  | flatten
)
| @tsv

you can run

jq -M -s -r -f filter.jq out.log

to produce tab-separated output

Labels  Date    Count

label1  2016-07-27  1
label2  2016-07-27  2
label2  2016-07-29  1
label3  2016-07-29  1
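
The question's original idea, keeping a key/value counter in the shell itself, can also be sketched with a bash associative array. This is not taken from either answer: count_pairs is a hypothetical function name, it assumes bash 4 or later, and it expects jq to have already reduced each record to a "label<TAB>timestamp" line.

```shell
#!/bin/bash
# Sketch: count (label, date) pairs with a bash associative array (bash 4+).
# Assumption: stdin holds one "label<TAB>timestamp" line per log record.
count_pairs() {
    local -A counts=()      # key: "label<TAB>date", value: number of occurrences
    local label ts key count
    while IFS=$'\t' read -r label ts; do
        [ -n "$label" ] || continue          # skip empty lines
        key="$label"$'\t'"${ts:0:10}"        # keep only the YYYY-MM-DD part
        counts[$key]=$(( ${counts[$key]:-0} + 1 ))
    done
    printf '%-8s %-12s %s\n' "Labels" "Date" "Count"
    # Associative arrays have no defined order, so sort before printing.
    for key in "${!counts[@]}"; do
        printf '%s\t%s\n' "$key" "${counts[$key]}"
    done | sort | while IFS=$'\t' read -r label ts count; do
        printf '%-8s %-12s %s\n' "$label" "$ts" "$count"
    done
}
```

Assuming out.log holds valid JSON objects, it could be fed with: jq -r '[.info, .timestamp] | @tsv' out.log | count_pairs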