Parsing XML files with Logstash

Time: 2015-07-11 05:45:38

Tags: xml elasticsearch logstash

I have the following Logstash configuration file:

input {  
file 
{
    path => "C:\Dashboard\Elmah\*.xml"
    start_position => "beginning"
    type => "error"
    codec => multiline 
    {
        pattern => "^<\?error .*\>"
        negate => true
        what => "previous"
    }
    sincedb_path => "C:\Dashboard\Elmah"
  }
}

filter 
{
    xml 
    {
        source => "error"
        xpath => 
        [
            "/error/@errorId", "ErrorId",
            "/error/@type", "Type",
            "/error/@message", "Message",
            "/error/@time", "Time",
            "/error/@user", "User"
        ]
        store_xml => true
    }
}

output 
{
    elasticsearch 
    { 
        action => "index"
        host => "localhost"
        index => "stock"
        workers => 1
    }
    stdout 
    {
        codec => rubydebug
    }
}

When I run bin/logstash -f agent.conf, I get no errors, but no data is inserted into Elasticsearch. A sample of the file is: https://www.dropbox.com/s/6oni2zhorsdtz6p/error-2015-06-26203423Z-3026bd43-07d6-44d6-a6cf-6d27b28a607e.xml?dl=0

How can I get Logstash to read in a collection of external XML files?

Logstash debug output:

Please see here: https://www.dropbox.com/s/g7g1154uvf9fr1f/outputlog2.txt?dl=0

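2 answers:

Answer 0: (score: 0)

I'm not sure you can use the file input here. I have only seen it used to watch files for changes, not to watch for new files. Unless your XML files are being updated, I don't think it will do anything. Keep in mind that Logstash generally follows new log lines.

Most people write tools like the following to batch-process whole files: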

https://github.com/elastic/elasticsearch-river-wikipedia

https://github.com/andrewvc/wikiparse

https://github.com/elastic/stream2es

These tools, especially the last one, seem closer to your use case.
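As a rough illustration of the batch approach, here is a minimal Python sketch (not taken from any of the tools above). It assumes each file holds a single `<error>` root element carrying the attributes named in the question's xpath list; the function names `parse_error` and `iter_errors` are made up for this example.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Attribute name -> output field name, copied from the question's xpath list.
FIELDS = {
    "errorId": "ErrorId",
    "type": "Type",
    "message": "Message",
    "time": "Time",
    "user": "User",
}


def parse_error(xml_data):
    """Turn one <error ...> document (str or bytes) into a flat dict."""
    root = ET.fromstring(xml_data)
    if root.tag != "error":
        raise ValueError(f"unexpected root element: {root.tag}")
    # Attributes that are absent from the document are skipped.
    return {name: root.attrib[attr] for attr, name in FIELDS.items() if attr in root.attrib}


def iter_errors(directory):
    """Yield one dict per *.xml file in the directory, in file-name order."""
    for path in sorted(Path(directory).glob("*.xml")):
        # Reading bytes lets the XML parser honor the file's declared encoding.
        yield parse_error(path.read_bytes())
```

Each dict could then be serialized with `json.dumps` and sent to Elasticsearch by whatever client you prefer; that step is left out here because it depends on your cluster version.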

Answer 1: (score: 0)

I managed to process a file containing one XML document per line with the following Logstash configuration. Hope this helps!

input {
    file {
        path => "/tmp/logstash/test.log"
        start_position => "beginning"
        # Do not remember the read position, so the file is re-read on every run
        sincedb_path => "/dev/null"
    }
}

filter {
    xml {
        # Each line read by the file input arrives in the "message" field
        source => "message"
        force_array => false
        xpath => [
            "/Event/@timestamp", "time",
            "/Event/user[1]/id[1]/text()", "user",
            "/Event/user[1]/ip[1]/text()[1]", "ip",
            "/Event/@eventType", "eventType",
            "/Event/transactionDuration/text()", "trxDuration"
        ]
        store_xml => true
    }
}

output {
    stdout {
        codec => line {
            format => "%{[time]}    %{[user]}   %{[eventType]}  %{[trxDuration]}"
        }
    }
}
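To check what those five fields should contain for a given input line before running it through Logstash, a small Python sketch like the one below can help. The sample line and its values are hypothetical; only the element and attribute names come from the xpath list in the configuration. The lookups use `xml.etree.ElementTree`, which roughly mirrors the XPath expressions rather than evaluating them.

```python
import xml.etree.ElementTree as ET

# Hypothetical input line: one complete <Event> document, as the answer assumes.
SAMPLE = (
    '<Event timestamp="2015-07-11T05:45:38Z" eventType="login">'
    "<user><id>alice</id><ip>10.0.0.1</ip></user>"
    "<transactionDuration>42</transactionDuration>"
    "</Event>"
)


def extract(line):
    """Pull out the same five fields the xml filter's xpath list targets."""
    event = ET.fromstring(line)
    return {
        "time": event.get("timestamp"),            # /Event/@timestamp
        "user": event.findtext("user/id"),         # /Event/user[1]/id[1]/text()
        "ip": event.findtext("user/ip"),           # /Event/user[1]/ip[1]/text()[1]
        "eventType": event.get("eventType"),       # /Event/@eventType
        "trxDuration": event.findtext("transactionDuration"),
    }
```

Calling `extract(SAMPLE)` returns a dict with `time`, `user`, `ip`, `eventType` and `trxDuration`; a missing element or attribute shows up as `None`.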
