I have been able to get Logstash to parse data from various sources and send it to Elasticsearch, but I'm not having much luck with the file input.
The directory /data contains various input files whose names begin with a prefix, such as a, followed by a date. Today, the directory might contain a_20170611.csv and a_20170612.csv, which have not yet been parsed. A new a file is added every day, so tomorrow a_20170613.csv will be added, and the original files will remain (at least for a short time, until they are purged).
When I first start Logstash, I want it to pick up the files that already exist and send their data to Elasticsearch. Then, as a new file becomes available each day, I want Logstash to pick it up and send its data to Elasticsearch. The old files will never contain new data, so Logstash should ignore them once they have been processed.
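The behavior described above is what the file input's start_position and sincedb settings are meant to control; a minimal sketch for one of the file types (the sincedb_path shown here is an arbitrary choice, not something from my current config):

```
input {
  file {
    path           => "/data/a_*.csv"
    delimiter      => "|"
    type           => "a"
    # Read files that already existed before Logstash started, from the top
    start_position => "beginning"
    # Persist per-file read offsets so processed files are not re-read on restart
    sincedb_path   => "/var/lib/logstash/sincedb_a"
  }
}
```

My understanding is that start_position only applies to files Logstash has never seen, and the sincedb database is what keeps already-processed files from being re-ingested, but I'm not certain this is sufficient for my daily-file case.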
Here is my logstash-pipeline.conf:
input {
  file {
    path      => "/data/a_*.csv"
    delimiter => "|"
    type      => "a"
  }
  file {
    path      => "/data/b_*.csv"
    delimiter => "|"
    type      => "b"
  }
  file {
    path      => "/data/c_*.csv"
    delimiter => "|"
    type      => "c"
  }
  file {
    path      => "/data/d_*.csv"
    delimiter => "|"
    type      => "d"
  }
}
output {
  if [type] == "a" {
    elasticsearch {
      hosts => ["ip:port"]
      index => "logstash-a-%{+YYYYMMdd}"
    }
  }
  if [type] == "b" {
    elasticsearch {
      hosts => ["ip:port"]
      index => "logstash-b-%{+YYYYMMdd}"
    }
  }
  if [type] == "c" {
    elasticsearch {
      hosts => ["ip:port"]
      index => "logstash-c-%{+YYYYMMdd}"
    }
  }
  if [type] == "d" {
    elasticsearch {
      hosts => ["ip:port"]
      index => "logstash-d-%{+YYYYMMdd}"
    }
  }
}