Using Logstash to export data from Elasticsearch to a CSV file

Date: 2016-03-12 12:11:21

Tags: elasticsearch logstash

I'm new to using Logstash and have been struggling to get data out of Elasticsearch as a CSV with it.

To create some sample data, we can first add a basic CSV to Elasticsearch... the head of the sample CSV can be seen below:

$ head uu.csv
"hh","hh1","hh3","id"
-0.979646332669359,1.65186132910743,"L",1
-0.283939374784435,-0.44785377794233,"X",2
0.922659898930901,-1.11689020559612,"F",3
0.348918777124474,1.95766948269957,"U",4
0.52667811182958,0.0168862169880919,"Y",5
-0.804765331279075,-0.186456470768865,"I",6
0.11411203100637,-0.149340801708981,"Q",7
-0.952836952412902,-1.68807271639322,"Q",8
-0.373528919496876,0.750994450392907,"F",9

I then loaded it into Logstash using the following config...

$ cat uu.conf 
input {
  stdin {}
}

filter {
  csv {
      columns => [
        "hh","hh1","hh3","id"
      ]
  }

  if [hh1] == "hh1" {
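      # the header row of the CSV also comes through as an event; its hh1
      # column contains the literal string "hh1", so drop it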
      drop { }
  } else {
      mutate {
          remove_field => [ "message", "host", "@timestamp", "@version" ]
      }

      mutate {
          convert => { "hh" => "float" }
          convert => { "hh1" => "float" }
          convert => { "hh3" => "string" }
          convert => { "id" => "integer" }
      }
  }
}

output {
  stdout { codec => dots }
  elasticsearch {
      index => "temp_index"
      document_type => "temp_doc"
      document_id => "%{id}"
  }
}
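
(Note that document_id => "%{id}" makes the import idempotent: re-running it indexes each row under its id, overwriting any existing document with that id instead of creating a duplicate.)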

The import itself was run with the following command...

$ cat uu.csv | logstash-2.1.3/bin/logstash -f uu.conf 
Settings: Default filter workers: 16
Logstash startup completed
 ....................................................................................................Logstash shutdown completed
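
As a quick sanity check (assuming Elasticsearch is running locally on the default port, as in the configs here), the number of indexed documents can be verified with a count query:

$ curl 'localhost:9200/temp_index/_count?pretty'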

So far so good, but now I'd like to get some of that data back out, specifically the hh and hh3 fields from temp_index.

I wrote the following to extract the data from Elasticsearch into a CSV:

$ cat yy.conf
input {
  elasticsearch {
    hosts => "localhost:9200"
    index => "temp_index"
    query => "*"
  }
}

filter {
    elasticsearch{
        add_field => {"hh" => "%{hh}"}
        add_field => {"hh3" => "%{hh3}"}
    }
}


output {
  stdout { codec => dots }
  csv {
      fields => ['hh','hh3']
      path => '/home/username/yy.csv'
  }
}

But the following error appears when trying to run Logstash...

$ logstash-2.1.3/bin/logstash -f yy.conf
The error reported is: 
  Couldn't find any filter plugin named 'elasticsearch'. Are you sure this is correct? Trying to load the elasticsearch filter plugin resulted in this error: no such file to load -- logstash/filters/elasticsearch
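
(In Logstash 2.x, plugins that are not bundled with the distribution can be installed with the plugin script, e.g. logstash-2.1.3/bin/plugin install logstash-filter-elasticsearch, assuming internet access; as it turns out (see the answer below), the filter isn't actually needed for this task.)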

What do I need to change in yy.conf so that the logstash command extracts the data from Elasticsearch and writes it into a new CSV named yy.csv?

UPDATE

Changing yy.conf to the following...

$ cat yy.conf
input {
  elasticsearch {
    hosts => "localhost:9200"
    index => "temp_index"
    query => "*"
  }
}

filter {}

output {
  stdout { codec => dots }
  csv {
      fields => ['hh','hh3']
      path => '/home/username/yy.csv'
  }
}

I get the following error...

$ logstash-2.1.3/bin/logstash -f yy.conf
Settings: Default filter workers: 16
Logstash startup completed
A plugin had an unrecoverable error. Will restart this plugin.
  Plugin: <LogStash::Inputs::Elasticsearch hosts=>["localhost:9200"], index=>"temp_index", query=>"*", codec=><LogStash::Codecs::JSON charset=>"UTF-8">, scan=>true, size=>1000, scroll=>"1m", docinfo=>false, docinfo_target=>"@metadata", docinfo_fields=>["_index", "_type", "_id"], ssl=>false>
  Error: [400] {"error":{"root_cause":[{"type":"parse_exception","reason":"Failed to derive xcontent"}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"init_scan","grouped":true,"failed_shards":[{"shard":0,"index":"temp_index","node":"zu3E6F7kQRWnDPY5L9zF-w","reason":{"type":"parse_exception","reason":"Failed to derive xcontent"}}]},"status":400} {:level=>:error}
A plugin had an unrecoverable error. Will restart this plugin.
  Plugin: <LogStash::Inputs::Elasticsearch hosts=>["localhost:9200"], index=>"temp_index", query=>"*", codec=><LogStash::Codecs::JSON charset=>"UTF-8">, scan=>true, size=>1000, scroll=>"1m", docinfo=>false, docinfo_target=>"@metadata", docinfo_fields=>["_index", "_type", "_id"], ssl=>false>
  Error: [400] {"error":{"root_cause":[{"type":"parse_exception","reason":"Failed to derive xcontent"}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"init_scan","grouped":true,"failed_shards":[{"shard":0,"index":"temp_index","node":"zu3E6F7kQRWnDPY5L9zF-w","reason":{"type":"parse_exception","reason":"Failed to derive xcontent"}}]},"status":400} {:level=>:error}
A plugin had an unrecoverable error. Will restart this plugin.
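
The "Failed to derive xcontent" parse error here comes from query => "*": the elasticsearch input sends the query as the JSON body of the scan/scroll search it runs (note scan=>true and scroll=>"1m" in the log above), so the value has to be full Elasticsearch query DSL, not a bare query string. A sketch of a valid equivalent is below; alternatively, the query setting can be omitted entirely to fall back to the plugin's default match_all query.

query => '{ "query": { "match_all": {} } }'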

Interestingly... if I change yy.conf to remove the elasticsearch{} wrapper, so that it looks like...

$ cat yy.conf
input {
  elasticsearch {
    hosts => "localhost:9200"
    index => "temp_index"
    query => "*"
  }
}

filter {
        add_field => {"hh" => "%{hh}"}
        add_field => {"hh3" => "%{hh3}"}
}


output {
  stdout { codec => dots }
  csv {
      fields => ['hh','hh3']
      path => '/home/username/yy.csv'
  }
}

I get the following error...

$ logstash-2.1.3/bin/logstash -f yy.conf
Error: Expected one of #, { at line 10, column 19 (byte 134) after filter {
        add_field 
You may be interested in the '--configtest' flag which you can
use to validate logstash's configuration before you choose
to restart a running system.

Also, just to cover its error message too, when yy.conf is changed to something like...

$ cat yy.conf
input {
  elasticsearch {
    hosts => "localhost:9200"
    index => "temp_index"
    query => "*"
  }
}

filter {
        add_field {"hh" => "%{hh}"}
        add_field {"hh3" => "%{hh3}"}
}


output {
  stdout { codec => dots }
  csv {
      fields => ['hh','hh3']
      path => '/home/username/yy.csv'
  }
}

I get the following error...

$ logstash-2.1.3/bin/logstash -f yy.conf
The error reported is: 
  Couldn't find any filter plugin named 'add_field'. Are you sure this is correct? Trying to load the add_field filter plugin resulted in this error: no such file to load -- logstash/filters/add_field
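
Both of these last two errors point at the same root cause: add_field is an option that filter plugins accept, not a filter plugin in its own right, so it can only appear inside a filter block such as mutate. A minimal sketch of a variant that parses:

filter {
  mutate {
    add_field => { "hh" => "%{hh}" }
    add_field => { "hh3" => "%{hh3}" }
  }
}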

UPDATE 2

Thanks to Val I've made some progress and am starting to get output, but it doesn't look right. Running the following commands, I get this output...

$ cat uu.csv | logstash-2.1.3/bin/logstash -f uu.conf 
Settings: Default filter workers: 16
Logstash startup completed
....................................................................................................Logstash shutdown completed

$ logstash-2.1.3/bin/logstash -f yy.conf
Settings: Default filter workers: 16
Logstash startup completed
....................................................................................................Logstash shutdown completed

$ head uu.csv
"hh","hh1","hh3","id"
-0.979646332669359,1.65186132910743,"L",1
-0.283939374784435,-0.44785377794233,"X",2
0.922659898930901,-1.11689020559612,"F",3
0.348918777124474,1.95766948269957,"U",4
0.52667811182958,0.0168862169880919,"Y",5
-0.804765331279075,-0.186456470768865,"I",6
0.11411203100637,-0.149340801708981,"Q",7
-0.952836952412902,-1.68807271639322,"Q",8
-0.373528919496876,0.750994450392907,"F",9

$ head yy.csv
-0.106007607975644E1,F
0.385395589205671E0,S
0.722392598488791E-1,Q
0.119773830827963E1,Q
-0.151090510772458E1,W
-0.74978830916084E0,G
-0.98888121700762E-1,M
0.965827615823707E0,S
-0.165311094671424E1,F
0.523818819076447E0,R
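
Two things stand out in yy.csv: the numbers come back in scientific notation, and neither the hh values nor the hh3 letters match any row of uu.csv. One possibility worth ruling out (an assumption, not something the output above proves) is that temp_index still contains documents from an earlier test run; deleting the index before re-importing would settle that:

$ curl -XDELETE 'localhost:9200/temp_index'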

Any help would be greatly appreciated...

1 Answer:

Answer 0 (score: 1)

You don't need the elasticsearch filter. The fields you want in your CSV are already contained in the event; you simply need to list them in the fields list of the csv output, exactly as you did, and you should be fine.

Concretely, your config file should look like this:

$ cat yy.conf
input {
  elasticsearch {
    hosts => "localhost:9200"
    index => "temp_index"
  }
}

filter {
}

output {
  stdout { codec => dots }
  csv {
      fields => ['hh','hh3']
      path => '/home/username/yy.csv'
  }
}
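
Note that the query setting from the earlier attempts is gone entirely; without it, the elasticsearch input falls back to its default match_all query, which avoids the "Failed to derive xcontent" parse error from the first update.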