Given the following input:
111.111.111.111, 11.11.11.11 - 11.11.11.11 [06/May/2016:08:26:10 +0000] "POST /some-service/GetSomething HTTP/1.1" 499 0 "-" "Jakarta Commons-HttpClient/3.1" "7979798797979799" 59.370 - "{\x0A\x22correlationId\x22:\x22TestCorr1\x22\x0A}"
and this Logstash configuration:
input { stdin {} }
output { stdout { codec => "rubydebug" } }
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG} %{QS:partner_id} %{NUMBER:req_time} %{GREEDYDATA:extra_fields}" }
    add_field => [ "received_at", "%{@timestamp}" ]
    add_field => [ "received_from", "%{host}" ]
  }
  mutate {
    gsub => ["extra_fields", "\\x0A","","extra_fields", "\\x22",'\"']
  }
  json {
    source => "extra_fields"
    target => "extra_fields_json"
  }
  mutate {
    add_field => {
      "correlationId" => "%{[extra_fields_json][correlationId]}"
    }
  }
}
we get the following output:
Logstash startup completed
{
    "message" => "111.111.111.111, 11.11.11.11 - 11.11.11.11 [06/May/2016:08:26:10 +0000] "POST /some-service/GetSomething HTTP/1.1" 499 0 "-" "Jakarta Commons-HttpClient/3.1" "8888888" 59.370 - "{\x0A\x22correlationId\x22 : \x22TestCorr1\x22\x0A}",
    "@version" => "1",
    "@timestamp" => "2016-06-23T18:00:28.831Z",
    "clientip" => "11.11.11.11",
    "ident" => "-",
    "auth" => "10.0.12.205",
    "timestamp" => "06/May/2016:08:26:10 +0000",
    "verb" => "POST",
    "request" => "/some-service/GetSomething",
    "httpversion" => "1.1",
    "response" => "499",
    "bytes" => "0",
    "referrer" => "\"-\"",
    "agent" => "\"Jakarta Commons-HttpClient/3.1\"",
    "partner_id" => "\"8888888\"",
    "req_time" => [
        [0] "59.370",
        [1] "1"
    ],
    "extra_fields" => "\"{\\\"correlationId\\\" : \\\"TestCorr_1\\\"}\"",
    "extra_fields_json" => "{\"correlationId\" : \"TestCorr_1\"}",
    "correlationId" => "correlationId"
}
Logstash shutdown completed
Why is the value of "correlationId" equal to "correlationId" instead of "TestCorr_1"?
Any help is appreciated. Thanks!
Answer 0 (score: 0)
You have to remove the extra " and \ characters from extra_fields so that the JSON is parsed correctly.
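To illustrate why the extra quotes matter, here is a minimal sketch in Python (not Logstash; it only mimics what a JSON parser does with this value). The wrapped string is copied from the extra_fields value in the rubydebug output of the question; the variable names are made up for the example. A JSON document that is itself wrapped in quotes parses to a plain string rather than an object, which is consistent with extra_fields_json showing up as a string in the output.

```python
import json

# extra_fields after the question's gsub: a JSON object wrapped in an extra
# pair of quotes, with the inner quotes escaped by backslashes.
wrapped = '"{\\"correlationId\\" : \\"TestCorr_1\\"}"'

# Parsing succeeds, but the result is a str containing JSON text, not a dict,
# so there is no correlationId key to look up.
parsed_wrapped = json.loads(wrapped)

# The same payload without the outer quotes and backslashes parses to a dict.
unwrapped = '{"correlationId" : "TestCorr_1"}'
parsed_unwrapped = json.loads(unwrapped)

print(type(parsed_wrapped).__name__)
print(type(parsed_unwrapped).__name__, parsed_unwrapped["correlationId"])
```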
You have to change your mutate filter to:
mutate {
  gsub => ["extra_fields", "\"","",
           "extra_fields", "\\x0A","",
           "extra_fields", "\\x22",'\"',
           "extra_fields", "(\\)",""
          ]
}
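The effect of these four substitutions can be traced outside Logstash. The following is a rough Python simulation, not the Logstash implementation. It assumes that the raw extra_fields value contains literal \x0A and \x22 sequences wrapped in quotes (taken from the input line in the question), that Logstash passes the patterns to the regex engine as written, and that '\"' stands for a backslash followed by a quote. The names raw, steps, and cleaned are only for illustration.

```python
import json
import re

# Assumed raw value of extra_fields: literal backslash-x sequences in quotes.
raw = r'"{\x0A\x22correlationId\x22:\x22TestCorr1\x22\x0A}"'

# The four gsub pairs, in order: (regex pattern, literal replacement).
steps = [
    (r'"', ''),         # drop the literal quotes around the payload
    (r'\\x0A', ''),     # drop the escaped newlines
    (r'\\x22', '\\"'),  # turn each escaped quote into backslash + quote
    (r'(\\)', ''),      # drop every remaining backslash
]

cleaned = raw
for pattern, replacement in steps:
    # A function replacement keeps the replacement text literal.
    cleaned = re.sub(pattern, lambda _match, r=replacement: r, cleaned)

print(cleaned)
print(json.loads(cleaned)["correlationId"])
```

Under those assumptions the cleaned value is a bare JSON object, so the json filter would yield a structure in which [extra_fields_json][correlationId] can be resolved.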
Please let me know whether it works or not.