I have the following Logstash configuration file:
input {
  file {
    path => "C:\Dashboard\Elmah\*.xml"
    start_position => "beginning"
    type => "error"
    codec => multiline {
      pattern => "^<\?error .*\>"
      negate => true
      what => "previous"
    }
    sincedb_path => "C:\Dashboard\Elmah"
  }
}
filter {
  xml {
    source => "error"
    xpath => [
      "/error/@errorId", "ErrorId",
      "/error/@type", "Type",
      "/error/@message", "Message",
      "/error/@time", "Time",
      "/error/@user", "User"
    ]
    store_xml => true
  }
}
output {
  elasticsearch {
    action => "index"
    host => "localhost"
    index => "stock"
    workers => 1
  }
  stdout {
    codec => rubydebug
  }
}
When I run bin/logstash -f agent.conf, I get no errors, but no data is inserted into Elasticsearch. A sample of the file is: https://www.dropbox.com/s/6oni2zhorsdtz6p/error-2015-06-26203423Z-3026bd43-07d6-44d6-a6cf-6d27b28a607e.xml?dl=0
How can I get Logstash to read in a collection of external XML files?
Logstash debug output:
Please see here: https://www.dropbox.com/s/g7g1154uvf9fr1f/outputlog2.txt?dl=0
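Answer 0 (score: 0)
I'm not sure you can use the file input here. I've only seen it used to watch files for changes, not to watch for new files. Unless your XML files are being updated, I don't think it will do anything. Keep in mind that Logstash normally watches for new log lines.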
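To make the point about tracked read positions concrete, here is a minimal sketch of a file input that rereads every matching file from the start on each run. It reuses the directory from the question and assumes a Windows host, so it uses forward slashes in the glob (commonly recommended, since backslashes can be read as escape characters) and NUL as the sincedb path, the Windows counterpart of /dev/null. Treat it as an illustration of the settings involved, not a verified fix.

```
input {
  file {
    # Assumption: forward slashes instead of backslashes in the glob
    path => "C:/Dashboard/Elmah/*.xml"
    # Only takes effect for files that have no recorded position yet
    start_position => "beginning"
    # Assumption: discard position tracking so each run starts over;
    # this setting expects a file path, not a directory
    sincedb_path => "NUL"
  }
}
```

With position tracking discarded like this, restarting Logstash reprocesses files it has already read, which suits a one-off import but would duplicate events in a long-running setup.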
Most people write tools like the following to batch-process whole files:
https://github.com/elastic/elasticsearch-river-wikipedia
https://github.com/andrewvc/wikiparse
https://github.com/elastic/stream2es
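These tools, especially the last one, seem closer to your use case.
Answer 1 (score: 0)
I have managed to process a file that contains one XML document per line using the following Logstash configuration. Hope this helps!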
input {
  file {
    path => "/tmp/logstash/test.log"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  xml {
    source => "message"
    force_array => false
    xpath => [
      "/Event/@timestamp", "time",
      "/Event/user[1]/id[1]/text()", "user",
      "/Event/user[1]/ip[1]/text()[1]", "ip",
      "/Event/@eventType", "eventType",
      "/Event/transactionDuration/text()", "trxDuration"
    ]
    store_xml => true
  }
}
output {
  stdout {
    codec => line {
      format => "%{[time]} %{[user]} %{[eventType]} %{[trxDuration]}"
    }
  }
}
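As a hedged sketch of how the same approach might carry over to the ELMAH files in the question, where each file holds one <error> document spread over several lines: the multiline codec joins the lines into a single event, and the xml filter then reads that event from the message field, which is where the file input places the text. The pattern, the forward-slash path, the NUL sincedb path, and the auto_flush_interval value are all assumptions. auto_flush_interval exists only in newer releases of the multiline codec, and without it the last document may stay buffered until another line matching the pattern arrives. This has not been tested against the sample file.

```
input {
  file {
    # Assumptions: Windows host, forward slashes, no position tracking
    path => "C:/Dashboard/Elmah/*.xml"
    start_position => "beginning"
    sincedb_path => "NUL"
    codec => multiline {
      # Assumption: each document begins with "<error" at the start of a line;
      # every other line is appended to the previous one
      pattern => "^<error"
      negate => true
      what => "previous"
      # Assumption: emit the buffered event after 2 idle seconds (newer codec versions only)
      auto_flush_interval => 2
    }
  }
}
filter {
  xml {
    # The joined document is in "message", not in a field named "error"
    source => "message"
    # Only the xpath results are kept; the full parsed tree is not stored
    store_xml => false
    xpath => [
      "/error/@errorId", "ErrorId",
      "/error/@type", "Type",
      "/error/@message", "Message",
      "/error/@time", "Time",
      "/error/@user", "User"
    ]
  }
}
output {
  # Print events to the console first to confirm that the fields are extracted
  stdout { codec => rubydebug }
}
```

Checking the output on stdout before adding the elasticsearch output back makes it easier to tell whether a problem lies in reading the files or in indexing them.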