Getting JSON from a file

Date: 2017-02-28 10:44:24

Tags: logstash

Logstash 5.2.1

I can't get Logstash to read a JSON document from a local file. Nothing shows up on stdout.

I run Logstash like this:

./logstash-5.2.1/bin/logstash -f logstash-5.2.1/config/shakespeare.conf --config.reload.automatic 

Logstash config:

input {
  file {
    path => "/home/trex/Development/Shipping_Data_To_ES/shakespeare.json"
    codec => json {}   
    start_position => "beginning"
  }
}
output {
  stdout {
    codec => rubydebug
  }
}

I also tried setting the charset:

...
codec => json {
  charset => "UTF-8"
}
...

I also tried the input with and without the json codec, and with a filter instead:

...
filter {
  json {
    source => "message"
  }
}
...

Logstash console after startup:

[2017-02-28T11:37:29,947][WARN ][logstash.agent           ] fetched new config for pipeline. upgrading.. {:pipeline=>"main", :config=>"input {\n  file {\n    path => \"/home/trex/Development/Shipping_Data_To_ES/shakespeare.json\"\n    codec => json {\n      charset => \"UTF-8\"\n    }\n    start_position => \"beginning\"\n  }\n}\n#filter {\n#  json {\n#    source => \"message\"\n#  }\n#}\noutput {\n  stdout {\n    codec => rubydebug\n  }\n}\n\n"}
[2017-02-28T11:37:29,951][WARN ][logstash.agent           ] stopping pipeline {:id=>"main"}
[2017-02-28T11:37:30,434][INFO ][logstash.pipeline        ] Starting pipeline {"id"=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>500}
[2017-02-28T11:37:30,446][INFO ][logstash.pipeline        ] Pipeline main started
^C[2017-02-28T11:40:55,039][WARN ][logstash.runner          ] SIGINT received. Shutting down the agent.
[2017-02-28T11:40:55,049][WARN ][logstash.agent           ] stopping pipeline {:id=>"main"}
^C[2017-02-28T11:40:55,475][FATAL][logstash.runner          ] SIGINT received. Terminating immediately..
The signal INT is in use by the JVM and will not work correctly on this platform
[trex@Latitude-E5510 Shipping_Data_To_ES]$ ./logstash-5.2.1/bin/logstash -f logstash-5.2.1/config/shakespeare.conf --config.test_and_exit
^C[trex@Latitude-E5510 Shipping_Data_To_ES]$ ./logstash-5.2.1/bin/logstash -f logstash-5.2.1/config/shakespeare.conf --confireload.automatic
^C[trex@Latitude-E5510 Shipping_Data_To_ES]$ ./logstash-5.2.1/bin/logstash -f logstash-5.2.1/config/shakespeare.conf --config.reload.aumatic
Sending Logstash's logs to /home/trex/Development/Shipping_Data_To_ES/logstash-5.2.1/logs which is now configured via log4j2.properties
[2017-02-28T11:45:48,752][INFO ][logstash.pipeline        ] Starting pipeline {"id"=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>500}
[2017-02-28T11:45:48,785][INFO ][logstash.pipeline        ] Pipeline main started
[2017-02-28T11:45:48,875][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}

Why isn't Logstash putting my JSON documents on stdout?

2 Answers:

Answer 0 (Score: 2)

Have you tried including the file type in the file input?

input {
  file {
    path => "/home/trex/Development/Shipping_Data_To_ES/shakespeare.json"
    type => "json"      # <-- add this
    # codec => json {}  # <-- commenting this out for the moment
    start_position => "beginning"
  }
}

Then have your filter:

filter{
    json{
        source => "message"
    }
}

If you are using the codec plugin, make sure to include it within the input like so:

codec => "json"

You may also want to give the json_lines plugin a try. Hope this thread comes in handy.
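
For reference, here is a minimal sketch of what the full config could look like with the type + filter suggestion applied (same file path as in the question, rubydebug output on stdout; untested, so adjust as needed):

input {
  file {
    path => "/home/trex/Development/Shipping_Data_To_ES/shakespeare.json"
    type => "json"
    start_position => "beginning"
  }
}
filter {
  json {
    source => "message"
  }
}
output {
  stdout {
    codec => rubydebug
  }
}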

Answer 1 (Score: 0)

sincedb_path turns out to matter when reading JSON files. I was only able to import the JSON after adding this option. Logstash uses it to remember the current position within the file so the import can resume from there if it is interrupted. I didn't need any position tracking, so I simply set it to /dev/null.

A basic working Logstash config:

input {
  file {
    path => ["/home/trex/Development/Shipping_Data_To_ES/shakespeare.json"]
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
output {
  stdout {
    codec => json_lines
  }
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "shakespeare" 
  }  
}
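
Once the pipeline is running, one quick way to check that documents actually reached Elasticsearch is to query the document count of the shakespeare index (assuming Elasticsearch is listening on localhost:9200, as configured above):

# ask Elasticsearch how many documents landed in the "shakespeare" index
curl -XGET 'localhost:9200/shakespeare/_count?pretty'

A non-zero count means the file was read and the events were indexed.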