TL:DR - 我的有效JSON日志被Logstash拒绝,因为有些转义字符导致JSON无效。不幸的是,我无法弄清问题是什么。
完整版:
我的Logstash(+ Logstash-Forwarder)从自定义日志格式中获取Apache日志,其格式如下:
LogFormat "{ \
\"@timestamp\": \"%{%Y-%m-%dT%H:%M:%S%z}t\", \
\"@version\": \"1\", \
\"vhost\":\"%V\", \
\"tags\":[\"apache-json\"], \
\"message\": \"%h %l %u %t \\\"%r\\\" %>s %b\", \
\"clientip\": \"%a\", \
\"duration\": %D, \
\"status\": %>s, \
\"request\": \"%U%q\", \
\"urlpath\": \"%U\", \
\"urlquery\": \"%q\", \
\"bytes\": %B, \
\"method\": \"%m\", \
\"referer\": \"%{Referer}i\", \
\"useragent\": \"%{User-agent}i\" \
}" ls_apache_json
相关的Logstash输入/过滤器配置非常简单:
input {
lumberjack {
port => 5000
type => "logs"
}
}
filter {
if [type] =~ /-json$/ {
json {
source => "message"
}
}
}
似乎仍然可以将某些日志解析为JSON - 我收到了很多这些错误:
{
:timestamp=>"2015-08-27T12:47:05.165000+0200",
:message=>"Trouble parsing json",
:source=>"message",
:raw=>"{ \t\"@timestamp\": \"2015-08-27T12:47:02+0200\", \t\"@version\": \"1\", \t\"vhost\":\"www.example.org\", \t\"tags\":[\"apache-json\"], \"clientip\": \"127.0.0.1\", \t\"duration\": 1280, \t\"status\": 200, \t\"request\": \"/uploads/_processed_/csm_D\\xc3\\xa4mpfungswanne_3_01_0c75c517e4.jpg\", \t\"urlpath\": \"/uploads/_processed_/csm_D\\xc3\\xa4mpfungswanne_3_01_0c75c517e4.jpg\", \t\"urlquery\": \"\", \t\"bytes\": 2913, \t\"method\": \"GET\", \t\"referer\": \"http://www.example.org/file.html\", \t\"useragent\": \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36\" }",
:exception=>#<LogStash::Json::ParserError: Unrecognized character escape 'x' (code 120) at [Source: [B@29a15d70; line: 1, column: 231]>
:level=>:warn
}
我用一个简单的Ruby片段仔细检查了原始消息,这可以解析JSON:
# irb
$ require 'json'
$ s= "{ \t\"@timestamp\": \"2015-08-27T12:47:02+0200\", \t\"@version\": \"1\", \t\"vhost\":\"www.example.org\", \t\"tags\":[\"apache-json\"], \"clientip\": \"127.0.0.1\", \t\"duration\": 1280, \t\"status\": 200, \t\"request\": \"/uploads/_processed_/csm_D\\xc3\\xa4mpfungswanne_3_01_0c75c517e4.jpg\", \t\"urlpath\": \"/uploads/_processed_/csm_D\\xc3\\xa4mpfungswanne_3_01_0c75c517e4.jpg\", \t\"urlquery\": \"\", \t\"bytes\": 2913, \t\"method\": \"GET\", \t\"referer\": \"http://www.example.org/file.html\", \t\"useragent\": \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36\" }"
$ JSON.parse(s)
=> {"@timestamp"=>"2015-08-27T12:47:02+0200", "@version"=>"1", "vhost"=>"www.example.org", "tags"=>["apache-json"], "clientip"=>"127.0.0.1", "duration"=>1280, "status"=>200, "request"=>"/uploads/_processed_/csm_Dxc3xa4mpfungswanne_3_01_0c75c517e4.jpg", "urlpath"=>"/uploads/_processed_/csm_Dxc3xa4mpfungswanne_3_01_0c75c517e4.jpg", "urlquery"=>"", "bytes"=>2913, "method"=>"GET", "referer"=>"http://www.example.org/file.html", "useragent"=>"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36"}
我正在运行Logstash 1.5.2。我认为这与编解码器有关,因为我试图为json解析器设置codec
参数并没有解决问题。 Apache除了确保使用正确的编解码器之外还有什么需要 - 我似乎无法找到任何配置选项:(
欢迎任何帮助。
答案 0 :(得分:0)
#replace " and \
ruby {
code => 'str=event["request_body"]; str=str.gsub("\\x22","\"").gsub("\\x5C", "\\"); event["request_body"]=str;'
}