My file contains a different value on each line, and I want to count the numbers that appear after a specific keyword. For example:
"fields" : {
"referer" : [ "-" ],
"@timestamp" : [ "2017-01-08T19:50:19.000Z" ],
"uri_path" : [ "test" ],
"method" : [ "GET" ],
"servername" : [ "INMESPWEB03" ],
"useragent" : [ "Mediapartners-Google" ],
"querystring" : [ "test" ],
"bytes-sent" : [ "227905" ],
"cshost" : [ "www.test.com" ],
"scstatus" : [ "200" ],
"time-taken" : [ "15468" ]
}
"fields" : {
"referer" : [ "-" ],
"@timestamp" : [ "2017-01-08T19:50:19.000Z" ],
"uri_path" : [ "test" ],
"method" : [ "GET" ],
"servername" : [ "INMESPWEB03" ],
"useragent" : [ "Mediapartners-Google" ],
"querystring" : [ "test" ],
"bytes-sent" : [ "227905" ],
"cshost" : [ "www.test.com" ],
"scstatus" : [ "300" ],
"time-taken" : [ "15468" ]
}
"fields" : {
"referer" : [ "-" ],
"@timestamp" : [ "2017-01-08T19:50:19.000Z" ],
"uri_path" : [ "test" ],
"method" : [ "GET" ],
"servername" : [ "INMESPWEB03" ],
"useragent" : [ "Mediapartners-Google" ],
"querystring" : [ "test" ],
"bytes-sent" : [ "227905" ],
"cshost" : [ "www.test.com" ],
"scstatus" : [ "200" ],
"time-taken" : [ "15468" ]
}
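So the result should be something along these lines: I want to check each number that follows "scstatus", count how many times each one occurs, and print the counts in ascending or descending order. Here is the code I have written so far; this script is what gives me the data above: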
curl -XPOST 'webpage.name.abc' -d '{
  "query": { "filtered": { "query": { "query_string": {
    "analyze_wildcard": true,
    "query": "useragent: \"googlebot\"|\"mediapartners-google\"|\"adsbot-google\""
  }}}},
  "size": 4000000,
  "fields": ["@timestamp","servername","uri_path","scstatus","method","cshost","useragent","time-taken","referer","bytes-sent","querystring"]
}'
Answer 0 (score: 1)
If your file format is fixed, this awk one-liner may help:
awk -F'"' '$2=="scstatus"{a[$4]++}END{for(x in a)print x,a[x]}' file
200 2
300 1
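The question also asks for the counts in ascending or descending order, and awk does not guarantee any particular order for `for (x in a)`. One way to get a stable order is to pipe the result through `sort`. The following is a minimal sketch under the same fixed-format assumption as the one-liner above; the function name `count_scstatus` is hypothetical and used only for illustration.

```shell
#!/bin/sh
# count_scstatus is a hypothetical helper name, not part of the original answer.
# It reads the files given as arguments, or standard input if none are given.
count_scstatus() {
    # Split each line on double quotes: $2 is the key, $4 is the first quoted value.
    awk -F'"' '$2 == "scstatus" { count[$4]++ }
               END { for (code in count) print code, count[code] }' "$@" |
    # Sort by count, highest first; break ties by status code, ascending.
    sort -k2,2nr -k1,1n
}
```

Calling `count_scstatus file` on the sample data from the question prints `200 2` followed by `300 1`. Dropping the `r` from `-k2,2nr` would give ascending order by count instead.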