I have a CSV file with the following structure:
col1, col2, col3
1|E|D
2|A|F
3|E|F
...
I am trying to index it in Elasticsearch using Logstash, so I created the following Logstash configuration file:
input {
  file {
    path => "/path/to/data"
    start_position => "beginning"
  }
}
filter {
  csv {
    separator => "|"
    columns => ["col1","col2","col3"]
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "myindex"
    document_type => "mydoctype"
  }
  stdout {}
}
However, Logstash just hangs and prints nothing beyond the following:
$ /opt/logstash/bin/logstash -f logstash.conf
Settings: Default pipeline workers: 8
Pipeline main started
Increasing the verbosity gives the following messages (none of which indicates a specific error):
$ /opt/logstash/bin/logstash -v -f logstash.conf
starting agent {:level=>:info}
starting pipeline {:id=>"main", :level=>:info}
Settings: Default pipeline workers: 8
Registering file input {:path=>["/path/to/data"], :level=>:info}
No sincedb_path set, generating one based on the file path {:sincedb_path=>"/home/username/.sincedb_55b24c6ff18079626c5977ba5741584a", :path=>["/path/to/data"], :level=>:info}
Using mapping template from {:path=>nil, :level=>:info}
Attempting to install template {:manage_template=>{"template"=>"logstash-*", "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"_default_"=>{"_all"=>{"enabled"=>true, "omit_norms"=>true}, "dynamic_templates"=>[{"message_field"=>{"match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"string", "index"=>"analyzed", "omit_norms"=>true, "fielddata"=>{"format"=>"disabled"}}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"string", "index"=>"analyzed", "omit_norms"=>true, "fielddata"=>{"format"=>"disabled"}, "fields"=>{"raw"=>{"type"=>"string", "index"=>"not_analyzed", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"string", "index"=>"not_analyzed"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"float"}, "longitude"=>{"type"=>"float"}}}}}}}, :level=>:info}
New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["localhost:9200"], :level=>:info}
Starting pipeline {:id=>"main", :pipeline_workers=>8, :batch_size=>125, :batch_delay=>5, :max_inflight=>1000, :level=>:info}
Pipeline main started
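Any suggestions on how to index the CSV file?
Answer 0 (score: 1)
If you already processed this file during earlier testing, Logstash has recorded it (inode and byte offset) in the sincedb file named in your log output, so it will not read those lines again. You can delete that sincedb file (if you don't need it), or set sincedb_path in your file {} input.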
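As a minimal sketch of the second option, the file input below points sincedb_path at /dev/null so that no read position is persisted and the file is read from the start on every run. The /dev/null value is an assumption that is not part of the original configuration; it presumes a Unix-like system and is only sensible for testing. Note that start_position => "beginning" only applies to files for which Logstash has no sincedb record, which is why that setting alone does not trigger a re-read.

```
input {
  file {
    path => "/path/to/data"
    start_position => "beginning"
    # Assumption: discard read positions (testing only, Unix-like systems)
    sincedb_path => "/dev/null"
  }
}
```

The filter and output sections from the question can stay as they are.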
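Answer 1 (score: 1)
Since Logstash tries not to replay lines it has already read from a file, you could instead use a tcp input and netcat the file to the open port.
The input section would look like this: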
input {
  tcp {
    port => 12345
  }
}
Then, once Logstash is running and listening on the port, you can send the data with:
cat /path/to/data | nc localhost 12345
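Putting this together, a sketch of the full pipeline with the tcp input might look as follows. The filter and output sections are carried over unchanged from the question, and 12345 is simply the example port used in this answer.

```
input {
  # Example port from this answer; any free port should do
  tcp {
    port => 12345
  }
}
filter {
  csv {
    separator => "|"
    columns => ["col1","col2","col3"]
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "myindex"
    document_type => "mydoctype"
  }
  stdout {}
}
```

Start Logstash with this file first, then run the nc command from another terminal.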