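I'm trying to write a more scalable .conf file. To get multiple indices in Elasticsearch, my idea is to split the path, take the last segment (the CSV file name), and use it as both the type and the index in Elasticsearch.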
input {
file {
path => "/home/aitor2/RETO8/BIGDATA/df_suministro_activa.csv"
start_position => "beginning"
sincedb_path => "/dev/null"
}
file {
path => "/home/aitor2/RETO8/BIGDATA/df_activo_consumo.csv"
start_position => "beginning"
sincedb_path => "/dev/null"
}
}
path2 = path.split('/')[-1]
filter {
if [path] == "/home/aitor2/RETO8/BIGDATA/df_suministro_activa.csv"{
mutate { replace => { type => path2 } }
csv {
separator => ","
skip_header => "true"
autodetect_column_names => true
}
ruby {
code => "event.to_hash.each { |k, v|
if k.start_with?('Linea') and v.is_a?(String)
event.set(k, v.to_f)
end
}
"
}
}
else if [path] == "/home/aitor2/RETO8/BIGDATA/df_activo_consumo.csv"{
mutate { replace => { type => "apaches2" } }
csv {
separator => ","
skip_header => "true"
autodetect_column_names => true
}
ruby {
code => "event.to_hash.each { |k, v|
if k.start_with?('Smart') and v.is_a?(String)
event.set(k, v.to_f)
end
}
"
}
}
}
output {
elasticsearch {
hosts => "http://localhost:9200"
index => "%{type}_indexer"
}
stdout {codec => rubydebug}
}
I tried to do this with path2 = path.split('/')[-1], but I'm not sure whether that is possible.
Answer 0 (score: 1)
In the filter section, set type to the file name (df_suministro_activa.csv or df_activo_consumo.csv). I use grok for this; mutate is another possibility (cf. doc).
You can then use type in the output or in the if-else, change its value, and so on.
input {
file {
path => "/home/aitor2/RETO8/BIGDATA/df_suministro_activa.csv"
...
}
file {
path => "/home/aitor2/RETO8/BIGDATA/df_activo_consumo.csv"
...
}
}
filter {
grok { match => { "path" => "%{UNIXPATH}/(?<type>[^/]+)" } }
if [type] == "df_suministro_activa.csv" {
...
}
else if [type] == "df_activo_consumo.csv" {
mutate { replace => { type => "apaches2" } }
...
}
}
output {
elasticsearch {
hosts => "http://localhost:9200"
index => "%{type}_indexer"
}
}
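To check what the (?<type>[^/]+) capture would yield before running the pipeline, the same idea can be tried in plain Ruby. This is only a sketch under assumptions: last_segment is a hypothetical helper name, the regular expression imitates the named capture rather than grok itself, and stripping the .csv extension for the index name is an optional extra that the configuration above does not do.

```ruby
# Hypothetical helper: returns the last path segment, imitating the
# (?<type>[^/]+) named capture used in the grok pattern.
def last_segment(path)
  m = path.match(%r{/(?<type>[^/]+)\z})
  m && m[:type]
end

paths = [
  "/home/aitor2/RETO8/BIGDATA/df_suministro_activa.csv",
  "/home/aitor2/RETO8/BIGDATA/df_activo_consumo.csv"
]

paths.each do |p|
  type = last_segment(p)
  # File.basename with a suffix argument also drops the ".csv" extension,
  # which gives a tidier index name (an assumption, not part of the answer).
  index = "#{File.basename(p, '.csv')}_indexer"
  puts "#{type} -> #{index}"
end
```

Inside Logstash, the same logic could probably live in a ruby filter, reading the path with event.get and writing the result with event.set, the call the question's filters already use.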
I'm not sure about the path field; you may want to try [log][file][path] instead of path in the filter block.
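Which field holds the file path depends on the Logstash version and its ECS compatibility setting, so a lookup that tries both is one way around the uncertainty. The sketch below uses a plain Hash as a stand-in for a Logstash event; file_path is a hypothetical helper, and the two sample events are made up for illustration.

```ruby
# Hypothetical helper: looks up the file path in a Hash that stands in for a
# Logstash event, trying the nested ECS-style field first and then the
# legacy top-level "path" field.
def file_path(event)
  event.dig("log", "file", "path") || event["path"]
end

# Illustrative events only; real events carry many more fields.
legacy = { "path" => "/home/aitor2/RETO8/BIGDATA/df_activo_consumo.csv" }
ecs = {
  "log" => { "file" => { "path" => "/home/aitor2/RETO8/BIGDATA/df_suministro_activa.csv" } }
}

[legacy, ecs].each do |event|
  # File.basename returns the last path segment, i.e. the CSV file name
  puts File.basename(file_path(event))
end
```

In a real ruby filter the equivalent lookup would be along the lines of event.get('[log][file][path]') || event.get('path'); checking one event with the rubydebug stdout output shows which of the two fields is actually populated.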