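I am building a program that feeds Apache log files to Logstash and, after parsing and filtering, outputs the results to an external database (Elastic, MongoDB, etc.).
Basically, the program will execute the following command: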
gunzip -c -k "somefile.gz" | logstash -f "logstash.conf"
where logstash.conf contains:
input {
  stdin {}
}
filter {
  # ...
}
output {
  mongodb {
    # ...
  }
}
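However, to help with future debugging and forensics, I would like to capture everything Logstash says while it runs.
By default, Logstash seems to write its internal log messages to stdout. For example, the following command: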
logstash -e "input { generator { count => 3 } } output { null {} }"
outputs:
Sending Logstash's logs to /usr/share/logstash/logs which is now configured via log4j2.properties
[2018-03-28T22:42:53,484][INFO ][logstash.modules.scaffold] Initializing module {:module_name=>"fb_apache", :directory=>"/usr/share/logstash/modules/fb_apache/configuration"}
[2018-03-28T22:42:53,515][INFO ][logstash.modules.scaffold] Initializing module {:module_name=>"netflow", :directory=>"/usr/share/logstash/modules/netflow/configuration"}
[2018-03-28T22:42:54,222][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2018-03-28T22:42:55,092][INFO ][logstash.runner ] Starting Logstash {"logstash.version"=>"6.2.3"}
[2018-03-28T22:42:55,795][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
[2018-03-28T22:42:58,225][INFO ][logstash.pipeline ] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50}
[2018-03-28T22:42:58,407][INFO ][logstash.pipeline ] Pipeline started succesfully {:pipeline_id=>"main", :thread=>"#<Thread:0x35d85de2 run>"}
[2018-03-28T22:42:58,559][INFO ][logstash.agent ] Pipelines running {:count=>1, :pipelines=>["main"]}
[2018-03-28T22:42:58,867][INFO ][logstash.pipeline ] Pipeline has terminated {:pipeline_id=>"main", :thread=>"#<Thread:0x35d85de2 run>"}
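Adding --log.level=debug gives even more: the same output as the previous command, plus a lot of extra lines, including the following message-related logs: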
[2018-03-28T22:44:30,671][DEBUG][logstash.pipeline ] filter received {"event"=>{"@timestamp"=>2018-03-28T22:44:30.567Z, "sequence"=>1, "@version"=>"1", "message"=>"Hello world!", "host"=>"5e26a941934f"}}
[2018-03-28T22:44:30,679][DEBUG][logstash.pipeline ] output received {"event"=>{"@timestamp"=>2018-03-28T22:44:30.567Z, "sequence"=>1, "@version"=>"1", "message"=>"Hello world!", "host"=>"5e26a941934f"}}
[2018-03-28T22:44:30,680][DEBUG][logstash.pipeline ] filter received {"event"=>{"@timestamp"=>2018-03-28T22:44:30.568Z, "sequence"=>2, "@version"=>"1", "message"=>"Hello world!", "host"=>"5e26a941934f"}}
...
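That is a lot of noise, but it would be useful when tracking down a bug.
I had trouble capturing the stdout and stderr of the logstash command directly in my program, but in any case I figured it would be easier to have Logstash write directly to a log file.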
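For context, capturing both streams at the shell level could be sketched roughly as below. This is only a sketch under assumptions: run_with_log is a hypothetical helper name and /tmp/logstash-run.log is a placeholder path, and neither is part of Logstash. It records whatever the process prints to stdout and stderr, which is not the same thing as getting Logstash to write its own log file.

```shell
#!/bin/sh
# Hypothetical helper (not part of Logstash): run a command, append both
# its stdout and stderr to a log file, and return the command's exit status.
run_with_log() {
    log_file=$1
    shift
    # ">>" appends to the file; "2>&1" sends stderr to the same place as stdout.
    "$@" >>"$log_file" 2>&1
}

# Example usage (placeholder paths, shown commented out):
# gunzip -c "somefile.gz" | run_with_log /tmp/logstash-run.log logstash -f "logstash.conf"
```

The helper's stdin is passed through to the wrapped command, so it can still sit at the end of the gunzip pipeline.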
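According to the documentation, adding --path.logs=/tmp to the logstash command should make it write its log files to /tmp, but it does not. There are no log files in /tmp during or after the execution of the following command: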
logstash \
  --path.logs=/tmp \
  --log.level=debug \
  -e "input { generator { count => 3 } } output { null {} }"
So: how do I make Logstash write its log messages (not the data it processes) to a file?