I'm going to have a type with an ARRAY of multi-part fields. The data in that field looks like this:
grp  type  num
111  ABC   112233445566
123  DEF   192898048901
222  ABC   180920948012
333  XWZ   112233445566
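I want to search my docs on num. I also want to be able to search on type and num together to find my doc, and optionally on all three: grp = 111, type = ABC, num = 112233445566.
What I don't want is cross-matching between these composite values, i.e. type = XWZ and num = 192898048901 must not produce a hit, because those two values come from different rows.
So do I implement these as multi_fields with a custom tokenizer? (I would probably want to create three key types.)
Or could a compound word token filter or some other technique help me achieve this? TIA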
Answer 0 (score: 0)
You can index the combinations as additional fields (num is mapped as long here, since values like 112233445566 do not fit in a 32-bit integer):
"doc" : {
  "properties" : {
    ...
    "array_type" : {
      "type" : "object",
      "properties" : {
        "grp" : { "type" : "integer", "index" : "not_analyzed" },
        "type" : { "type" : "string", "index" : "not_analyzed" },
        "num" : { "type" : "long", "index" : "not_analyzed" },
        "type_num" : { "type" : "string", "index" : "not_analyzed" },
        "grp_type_num" : { "type" : "string", "index" : "not_analyzed" }
      }
    },
    ...
  }
}
When querying, use the field that matches the information you have. For example, to search on type and num, you could write a query like this:
{
  "size": 20,
  "from": 0,
  "query": {
    "filtered": {
      "filter": {
        "and": [
          {
            "term": {
              "type_num": "XWZ 112233445566"
            }
          }
        ]
      }
    }
  }
}
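The combined fields have to be filled in by the client before the document is indexed. A minimal Python sketch of that step follows; the helper name add_composite_keys is hypothetical, and the single-space separator is an assumption taken from the "XWZ 112233445566" value in the query above.

```python
def add_composite_keys(entry, sep=" "):
    """Return a copy of one array entry with the combined lookup fields added.

    The field names type_num and grp_type_num follow the mapping above;
    the separator is an assumption based on the example query value.
    """
    grp, typ, num = str(entry["grp"]), entry["type"], str(entry["num"])
    enriched = dict(entry)
    enriched["type_num"] = sep.join([typ, num])
    enriched["grp_type_num"] = sep.join([grp, typ, num])
    return enriched


# The sample rows from the question.
rows = [
    {"grp": 111, "type": "ABC", "num": 112233445566},
    {"grp": 123, "type": "DEF", "num": 192898048901},
    {"grp": 222, "type": "ABC", "num": 180920948012},
    {"grp": 333, "type": "XWZ", "num": 112233445566},
]

array_type = [add_composite_keys(row) for row in rows]
type_num_values = {entry["type_num"] for entry in array_type}
```

Because each combined value is built from a single row, a term lookup on type_num can only hit pairs that really occur together: "XWZ 112233445566" is in type_num_values, while the cross-row pair "XWZ 192898048901" is not.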
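Answer 1 (score: 0)
I found a simpler way. The key is that I only need to be able to search by the three possible combinations; I don't need to reference grp, type, or num directly.
A path_hierarchy analyzer does what I want: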
# Create a new index with custom path_hierarchy analyzer
# See http://www.elasticsearch.org/guide/reference/index-modules/analysis/pathhierarchy-tokenizer.html
curl -XPUT "localhost:9200/accts-test" -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "accts-analyzer": {
          "type": "custom",
          "tokenizer": "accts-tokenizer"
        }
      },
      "tokenizer": {
        "accts-tokenizer": {
          "type": "path_hierarchy",
          "delimiter": "-",
          "reverse": "true"
        }
      }
    }
  },
  "mappings": {
    "_default_": {
      "_timestamp" : {
        "enabled" : true,
        "store" : true
      }
    },
    "doc": {
      "properties": {
        "name": { "type": "string"},
        "accts": {
          "type": "string",
          "index_name": "acct",
          "index_analyzer": "accts-analyzer",
          "search_analyzer": "keyword"
        }
      }
    }
  }
}'
Testing it through the _analyze endpoint then shows:
# curious about path analyzer? test it:
echo testing analyzer
curl -XGET 'localhost:9200/accts-test/_analyze?analyzer=accts-analyzer&pretty=1' -d '111-BBB-2233445566'
echo
{
  "tokens" : [ {
    "token" : "111-BBB-2233445566",
    "start_offset" : 0,
    "end_offset" : 18,
    "type" : "word",
    "position" : 1
  }, {
    "token" : "BBB-2233445566",
    "start_offset" : 4,
    "end_offset" : 18,
    "type" : "word",
    "position" : 1
  }, {
    "token" : "2233445566",
    "start_offset" : 8,
    "end_offset" : 18,
    "type" : "word",
    "position" : 1
  } ]
}
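To see why this rules out cross-matches, here is a small Python sketch that imitates the token list above. It is not Elasticsearch code: reverse_path_tokens and doc_matches are hypothetical helpers, and the accts values are assumed to be stored as grp-type-num strings built from the sample rows in the question.

```python
def reverse_path_tokens(value, delimiter="-"):
    """Imitate the suffix tokens shown in the _analyze output above."""
    parts = value.split(delimiter)
    return [delimiter.join(parts[i:]) for i in range(len(parts))]


def doc_matches(accts, query):
    """The search side uses the keyword analyzer, so the query stays one whole term."""
    indexed_terms = set()
    for value in accts:
        indexed_terms.update(reverse_path_tokens(value))
    return query in indexed_terms


# Assumed document content: one "grp-type-num" string per row of the question's data.
accts = [
    "111-ABC-112233445566",
    "123-DEF-192898048901",
    "222-ABC-180920948012",
    "333-XWZ-112233445566",
]

tokens = reverse_path_tokens("111-BBB-2233445566")
hit_num_only = doc_matches(accts, "112233445566")          # num alone
hit_type_num = doc_matches(accts, "DEF-192898048901")      # type + num from one row
hit_all_three = doc_matches(accts, "111-ABC-112233445566")  # grp + type + num
hit_cross = doc_matches(accts, "XWZ-192898048901")         # type and num from different rows
```

Only suffixes of a single stored value are ever indexed, so the first three lookups succeed and the cross-row lookup does not. The trade-off is that combinations that are not suffixes, such as grp alone or grp with num, cannot be searched with this layout.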