I am trying to use the grok filter in logstash to parse my logs into structured fields.
Here is a sample log line:
5d563f04-b5d8-4b8d-b3ac-df26028c3719 SoapRequest CheckUserPassword <?xml version=\"1.0\" encoding=\"utf-8\"?><soap:Envelope xmlns:soap=\"http://schemas.xmlsoap.org/soap/envelope/\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\"><soap:Body><CheckUserPassword xmlns=\"http://users.tvinci.com/\"><sWSUserName>users_199</sWSUserName><sWSPassword>11111</sWSPassword><sUserName>test</sUserName><sPassword>123456</sPassword><bPreventDoubleLogins>false</bPreventDoubleLogins></CheckUserPassword></soap:Body></soap:Envelope>
This is my grok match pattern:
%{DATA:method_id} %{WORD:method_type} %{WORD:method} %{GREEDYDATA:data}
The structure I get back is:
"method_id" => "963ad634-92d6-4a6c-9e6b-ef57e6bcd374",
"method_type" => "SoapRequest",
"method" => "CheckUserPassword",
"data" => " <?xml version=\"1.0\" encoding=\"utf-8\"?><soap:Envelope xmlns:soap=\"http://schemas.xmlsoap.org/soap/envelope/\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\"><soap:Body><CheckUserPassword"
Everything is correct except the data field: I expected to see the whole SOAP XML there, but as you can see it is cut off in the middle.
Any suggestions?
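Answer 0 (score: 2)
Use the following filter, placed before grok, to strip newline characters from the message. GREEDYDATA is defined as .* and by default does not match across a line break: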
mutate {
gsub => [
"message", "\n", "", # Unix newline
"message", "\r", "", # classic Mac OS newline (carriage return)
"message", "\r\n", "" # Windows newline (already covered by the two rules above)
]
}
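An alternative that is not part of the answer above, sketched under the assumption that the truncation is caused by a line break inside the message: prefix the grok pattern with (?m). In the Ruby-style regex syntax that grok uses, this flag lets "." (and therefore GREEDYDATA) match newline characters, so the message does not have to be rewritten first. The pattern is the one from the question; only the (?m) prefix is added.

```
filter {
  grok {
    # (?m) makes "." match newlines, so GREEDYDATA can capture
    # the XML even when it spans several lines.
    match => [ "message", "(?m)%{DATA:method_id} %{WORD:method_type} %{WORD:method} %{GREEDYDATA:data}" ]
  }
}
```

Unlike the gsub approach, this keeps the original line breaks inside the data field, which may or may not be what you want downstream.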
Answer 1 (score: 0)
Actually, your pattern should work. Something else in your configuration must be wrong. Try the following:
logstash.conf
input {
stdin {
}
}
filter {
grok {
match => [ "message", "%{DATA:method_id} %{WORD:method_type} %{WORD:method} %{GREEDYDATA:data}" ]
}
}
output {
stdout { debug => true }
}
$ java -jar logstash-1.2.1-flatjar.jar agent -f logstash.conf
5d563f04-b5d8-4b8d-b3ac-df26028c3719 SoapRequest CheckUserPassword <?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Body><CheckUserPassword xmlns="http://users.tvinci.com/"><sWSUserName>users_199</sWSUserName><sWSPassword>11111</sWSPassword><sUserName>test</sUserName><sPassword>123456</sPassword><bPreventDoubleLogins>false</bPreventDoubleLogins></CheckUserPassword></soap:Body></soap:Envelope>
{
"message" => "5d563f04-b5d8-4b8d-b3ac-df26028c3719 SoapRequest CheckUserPassword <?xml version=\"1.0\" encoding=\"utf-8\"?><soap:Envelope xmlns:soap=\"http://schemas.xmlsoap.org/soap/envelope/\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\"><soap:Body><CheckUserPassword xmlns=\"http://users.tvinci.com/\"><sWSUserName>users_199</sWSUserName><sWSPassword>11111</sWSPassword><sUserName>test</sUserName><sPassword>123456</sPassword><bPreventDoubleLogins>false</bPreventDoubleLogins></CheckUserPassword></soap:Body></soap:Envelope>\r",
"@timestamp" => "2013-10-26T04:01:19.386Z",
"@version" => "1",
"host" => "NYCL530",
"method_id" => "5d563f04-b5d8-4b8d-b3ac-df26028c3719",
"method_type" => "SoapRequest",
"method" => "CheckUserPassword",
"data" => "<?xml version=\"1.0\" encoding=\"utf-8\"?><soap:Envelope xmlns:soap=\"http://schemas.xmlsoap.org/soap/envelope/\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\"><soap:Body><CheckUserPassword xmlns=\"http://users.tvinci.com/\"><sWSUserName>users_199</sWSUserName><sWSPassword>11111</sWSPassword><sUserName>test</sUserName><sPassword>123456</sPassword><bPreventDoubleLogins>false</bPreventDoubleLogins></CheckUserPassword></soap:Body></soap:Envelope>\r"
}
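Note that in the output above both message and data end with a carriage return ("\r"), most likely because the line was typed on a Windows console. As a hedged sketch (not something the original answer includes), a mutate filter placed after grok could trim it from the data field:

```
filter {
  mutate {
    # Remove a single trailing carriage return from the extracted XML.
    gsub => [ "data", "\r$", "" ]
  }
}
```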
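Answer 2 (score: 0)
If you have the newline problem that the other posters suggest, then from posts I have seen online it appears that the multiline codec option on the file input also solves it.
In the example below, whenever I see a TIMESTAMP_ISO8601 at the very beginning of a line, I declare it a new "entry"; everything up to the next timestamp at the beginning of a line is part of that entry.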
input {
file {
path => "/var/elasticsearch-input/Log.log"
type => "log4netLog"
codec => multiline {
pattern => "^%{TIMESTAMP_ISO8601} "
negate => true
what => previous
}
}
}
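In your case, you need a regular expression for the GUID and to put it there as the pattern. It might look like this, but I am not 100% sure, because I have not tested it.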
input {
file {
path => "/var/elasticsearch-input/Log.log"
type => "log4netLog"
codec => multiline {
pattern => "^%{UUID} "
negate => true
what => previous
}
}
}
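One caveat worth spelling out: the multiline codec joins the continuation lines into a single event, but the line breaks between them remain in the message, so GREEDYDATA would still stop at the first one. The following untested sketch combines the input above with the newline-stripping mutate from the earlier answer and the grok pattern from the question. The path is the one from the example above and stands in for your own log file.

```
input {
  file {
    # Path from the example above; replace with your own log file.
    path => "/var/elasticsearch-input/Log.log"
    codec => multiline {
      # A line that does NOT start with a UUID is appended to the previous event.
      pattern => "^%{UUID} "
      negate => true
      what => "previous"
    }
  }
}
filter {
  mutate {
    # Drop the line breaks that the multiline codec kept inside the event.
    gsub => [ "message", "[\r\n]+", "" ]
  }
  grok {
    match => [ "message", "%{DATA:method_id} %{WORD:method_type} %{WORD:method} %{GREEDYDATA:data}" ]
  }
}
```

The mutate block has to come before grok, because filters run in the order in which they appear in the configuration.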
Documentation: https://www.elastic.co/guide/en/logstash/current/plugins-codecs-multiline.html
The article I took this from (under "Configuring logstash - inputs"): http://www.ben-morris.com/using-logstash-elasticsearch-and-log4net-for-centralized-logging-in-windows/