在我的系统中,数据的插入总是通过logstash通过csv文件完成。我从未预先定义映射。但每当我输入一个字符串时,它总是被视为analyzed
,因此像hello I am Sinha
这样的条目被分为hello
,I
,am
, Sinha
。无论如何我可以更改elasticsearch的默认/动态映射,以便所有字符串,无论索引如何,无论类型如何都被视为not analyzed
?或者有没有办法在.conf
文件中设置它?说我的conf
文件看起来像
input {
file {
path => "/home/sagnik/work/logstash-1.4.2/bin/promosms_dec15.csv"
type => "promosms_dec15"
start_position => "beginning"
sincedb_path => "/dev/null"
}
}
filter {
csv {
columns => ["Comm_Plan","Queue_Booking","Order_Reference","Multi_Ordertype"]
separator => ","
}
ruby {
code => "event['Generation_Date'] = Date.parse(event['Generation_Date']);"
}
}
output {
elasticsearch {
action => "index"
host => "localhost"
index => "promosms-%{+dd.MM.YYYY}"
workers => 1
}
}
我希望所有字符串都是not analyzed
,我不介意它是将所有未来数据插入到elasticsearch中的默认设置
答案 0 :(得分:28)
只需创建一个模板。运行
curl -XPUT localhost:9200/_template/template_1 -d '{
"template": "*",
"settings": {
"index.refresh_interval": "5s"
},
"mappings": {
"_default_": {
"_all": {
"enabled": true
},
"dynamic_templates": [
{
"string_fields": {
"match": "*",
"match_mapping_type": "string",
"mapping": {
"index": "not_analyzed",
"omit_norms": true,
"type": "string"
}
}
}
],
"properties": {
"@version": {
"type": "string",
"index": "not_analyzed"
},
"geoip": {
"type": "object",
"dynamic": true,
"path": "full",
"properties": {
"location": {
"type": "geo_point"
}
}
}
}
}
}
}'
答案 1 :(得分:20)
您可以查询字段的.raw
版本。这已添加到Logstash 1.3.1:
我们提供的logstash索引模板为您索引的每个字段添加“.raw”字段。这些“.raw”字段由logstash设置为“not_analyzed”,因此不会进行分析或标记化 - 我们的原始值按原样使用!
因此,如果您的字段被称为foo
,则会查询foo.raw
以返回not_analyzed
(不在分隔符上拆分)版本。
答案 2 :(得分:13)
从Logstash发行版中复制lib / logstash / outputs / elasticsearch / elasticsearch-template.json(可能安装为/opt/logstash/lib/logstash/outputs/elasticsearch/elasticsearch-template.json),修改它通过替换
"dynamic_templates" : [ {
"string_fields" : {
"match" : "*",
"match_mapping_type" : "string",
"mapping" : {
"type" : "string", "index" : "analyzed", "omit_norms" : true,
"fields" : {
"raw" : {"type": "string", "index" : "not_analyzed", "ignore_above" : 256}
}
}
}
} ],
与
"dynamic_templates" : [ {
"string_fields" : {
"match" : "*",
"match_mapping_type" : "string",
"mapping" : {
"type" : "string", "index" : "not_analyzed", "omit_norms" : true
}
}
} ],
并点template
为您输出插件到修改过的文件:
output {
elasticsearch {
...
template => "/path/to/my-elasticsearch-template.json"
}
}
您仍然可以覆盖特定字段的此默认值。
答案 3 :(得分:1)