I am trying to load a simple CSV file into Elasticsearch using Logstash.
However, when I run it, I get the following error on the date conversion (the first column, Date):
"error"=>{
"type"=>"mapper_parsing_exception",
"reason"=>"failed to parse [Date]",
"caused_by"=>{
"type"=>"illegal_argument_exception",
"reason"=>"Invalid format: \"Date\""}}}}
When I remove the Date column, everything works fine.
I am using the following CSV file:
Date,Open,High,Low,Close,Volume,Adj Close
2015-04-02,125.03,125.56,124.19,125.32,32120700,125.32
2015-04-01,124.82,125.12,123.10,124.25,40359200,124.25
2015-03-31,126.09,126.49,124.36,124.43,41852400,124.43
2015-03-30,124.05,126.40,124.00,126.37,46906700,126.37
and the following logstash.conf:
input {
  file {
    path => "path/file.csv"
    type => "core2"
    start_position => "beginning"
  }
}

filter {
  csv {
    separator => ","
    columns => ["Date","Open","High","Low","Close","Volume","Adj Close"]
  }
  mutate {convert => ["High", "float"]}
  mutate {convert => ["Open", "float"]}
  mutate {convert => ["Low", "float"]}
  mutate {convert => ["Close", "float"]}
  mutate {convert => ["Volume", "float"]}
  date {
    match => ["Date", "yyyy-MM-dd"]
    target => "Date"
  }
}

output {
  elasticsearch {
    action => "index"
    hosts => "localhost"
    index => "stock15"
    workers => 1
  }
  stdout {}
}
It looks like I am already handling the date. Any idea what could be wrong?
Thanks!
Answer 0 (score: 1)
The problem is in the file itself. Logstash reads the first line but cannot parse it:
Date,Open,High,Low,Close,Volume,Adj Close
The straightforward fix is to remove the header line from the file:
2015-04-02,125.03,125.56,124.19,125.32,32120700,125.32
2015-04-01,124.82,125.12,123.10,124.25,40359200,124.25
2015-03-31,126.09,126.49,124.36,124.43,41852400,124.43
2015-03-30,124.05,126.40,124.00,126.37,46906700,126.37
After that it should work fine.
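To see why the header line trips the date filter, here is a minimal Python sketch (my illustration, not part of the original setup): Python's strptime pattern %Y-%m-%d plays the role of the Joda-Time pattern yyyy-MM-dd used in the Logstash date filter.

```python
from datetime import datetime

FORMAT = "%Y-%m-%d"  # analogous to the "yyyy-MM-dd" pattern in the date filter

def parse_date(value):
    """Return a datetime if the value matches the pattern, else None."""
    try:
        return datetime.strptime(value, FORMAT)
    except ValueError:
        return None

# A data row parses cleanly...
print(parse_date("2015-04-02"))  # 2015-04-02 00:00:00
# ...but the literal header value "Date" does not, which is exactly
# what produces the "Invalid format: \"Date\"" exception above.
print(parse_date("Date"))        # None
```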
There is a related issue on GitHub: https://github.com/elastic/logstash/issues/2088
Answer 1 (score: 0)
Thanks @Yeikel. I ended up changing the Logstash configuration instead of the data itself.
Before applying the csv filter, I use a regex to check whether the line is the header. If it is, I drop the event and move on to the next line, which is then processed by the csv filter.
Here is the updated configuration that handles the header:
input {
  file {
    path => "path/file.csv"
    start_position => "beginning"
  }
}

filter {
  if ([message] =~ "\bDate\b") {
    drop { }
  } else {
    csv {
      separator => ","
      columns => ["Date","Open","High","Low","Close","Volume","Adj Close"]
    }
    mutate {convert => ["High", "float"]}
    mutate {convert => ["Open", "float"]}
    mutate {convert => ["Low", "float"]}
    mutate {convert => ["Close", "float"]}
    mutate {convert => ["Volume", "float"]}
    date {
      match => ["Date", "yyyy-MM-dd"]
    }
  }
}

output {
  elasticsearch {
    action => "index"
    hosts => "localhost"
    index => "stock15"
    workers => 1
  }
  stdout {
    codec => rubydebug
  }
}
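As an aside: depending on the version of the csv filter plugin you have installed, there may also be a built-in skip_header option that could replace the regex/drop combination above. This is a hedged sketch, not something from the original answers; check your installed plugin's documentation for availability and exact semantics.

```
filter {
  csv {
    separator => ","
    skip_header => true   # available in newer csv filter versions
    columns => ["Date","Open","High","Low","Close","Volume","Adj Close"]
  }
}
```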