Elasticsearch throws a string-to-date conversion error when importing a CSV with Logstash

Asked: 2016-12-25 18:36:28

Tags: csv elasticsearch logstash logstash-configuration

I'm trying to load a simple CSV file into Elasticsearch using Logstash.

But when I run it, I get the following error while converting the first column (Date) to a date:

"error"=>{
"type"=>"mapper_parsing_exception", 
"reason"=>"failed to parse [Date]",
"caused_by"=>{
    "type"=>"illegal_argument_exception", 
    "reason"=>"Invalid format: \"Date\""}}}}

When I remove the Date column, everything works fine.

I'm using the following CSV file:

Date,Open,High,Low,Close,Volume,Adj Close
2015-04-02,125.03,125.56,124.19,125.32,32120700,125.32
2015-04-01,124.82,125.12,123.10,124.25,40359200,124.25
2015-03-31,126.09,126.49,124.36,124.43,41852400,124.43
2015-03-30,124.05,126.40,124.00,126.37,46906700,126.37

and the following logstash.conf:

input {
  file {
    path => "path/file.csv"
    type => "core2"
    start_position => "beginning"    
  }
}
filter {
  csv {
      separator => ","
      columns => ["Date","Open","High","Low","Close","Volume","Adj Close"]
  }
  mutate {convert => ["High", "float"]}
  mutate {convert => ["Open", "float"]}
  mutate {convert => ["Low", "float"]}
  mutate {convert => ["Close", "float"]}
  mutate {convert => ["Volume", "float"]}
  date {
    match => ["Date", "yyyy-MM-dd"]
    target => "Date"
  }
}
output {  
    elasticsearch {
        action => "index"
        hosts => "localhost"
        index => "stock15"
        workers => 1
    }
    stdout {}
}

It seems I am handling the Date column correctly; any idea what could be wrong?

Thanks!

2 answers:

Answer 0 (score: 1)

The problem is in the file itself. Logstash reads the first line but cannot parse it:

Date,Open,High,Low,Close,Volume,Adj Close

The straightforward fix is to remove the file's header line:

2015-04-02,125.03,125.56,124.19,125.32,32120700,125.32
2015-04-01,124.82,125.12,123.10,124.25,40359200,124.25
2015-03-31,126.09,126.49,124.36,124.43,41852400,124.43
2015-03-30,124.05,126.40,124.00,126.37,46906700,126.37

and it should work fine.

There is a related issue on GitHub: https://github.com/elastic/logstash/issues/2088
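If you'd rather not edit the file by hand, the header can be stripped with a one-liner. This is a minimal sketch; the file names `file.csv` and `file_noheader.csv` are examples, so adjust them to your paths:

```shell
# Write a copy of the CSV starting at line 2, i.e. without the header row.
# (file names are examples -- adjust to your paths)
tail -n +2 file.csv > file_noheader.csv
```

Point the Logstash `file` input at the headerless copy and the `date` filter no longer sees the literal string "Date".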

Answer 1 (score: 0)

Thanks @Yeikel, I ended up changing the Logstash configuration rather than the data itself.

Before applying the csv filter, I check with a regex whether the line is the header. If it is the header, I drop it and move on to the next line (which will then be processed by the csv filter).

See the updated configuration, which solves the header problem:

input {  
  file {
    path => "path/file.csv"
    start_position => "beginning"    
  }
}
filter {  
    if ([message] =~ "\bDate\b") {
        drop { }
    } else {
        csv {
            separator => ","
            columns => ["Date","Open","High","Low","Close","Volume","Adj Close"]
        }
        mutate {convert => ["High", "float"]}
        mutate {convert => ["Open", "float"]}
        mutate {convert => ["Low", "float"]}
        mutate {convert => ["Close", "float"]}
        mutate {convert => ["Volume", "float"]}
        date {
            match => ["Date", "yyyy-MM-dd"]
        }
    }
}
output {  
    elasticsearch {
        action => "index"
        hosts => "localhost"
        index => "stock15"
        workers => 1
    }
    stdout {
        codec => rubydebug
     }
}
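The conditional in the filter above boils down to: drop any line containing the word "Date", keep everything else. A minimal sketch of that same check in plain Python (outside Logstash, using sample lines from the CSV above):

```python
import re

# Same pattern as the Logstash conditional: [message] =~ "\bDate\b"
HEADER_PATTERN = re.compile(r"\bDate\b")

lines = [
    "Date,Open,High,Low,Close,Volume,Adj Close",
    "2015-04-02,125.03,125.56,124.19,125.32,32120700,125.32",
]

# Keep only non-header lines, as the drop {} filter would.
data_lines = [line for line in lines if not HEADER_PATTERN.search(line)]
print(data_lines)  # the header line is filtered out
```

Note that `\b` word boundaries matter here: the pattern matches the standalone word "Date" in the header but not the date values themselves, which contain no letters.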