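I am having trouble understanding some unexpected results from an Elasticsearch query. I have indexed the following document in Elasticsearch.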
{
"group": "J00-I99", "codes": [
{ "id": "J15", "description": "hello world" },
{ "id": "J15.0", "description": "test one world" },
{ "id": "J15.1", "description": "test two world J15.0" },
{ "id": "J15.2", "description": "test two three world J15" },
{ "id": "J15.3", "description": "hello world J18 " },
............................ // Similar records here
{ "id": "J15.9", "description": "hello world new" },
{ "id": "J16.0", "description": "new description" }
]
}
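My goal here is to implement autocomplete, and for that I am using the n-gram approach. I do not want to use the completion suggester.
Currently I am stuck on two issues:
1. Search query: J15
Expected result: all of the records above that include J15
Actual result: only a few results are returned (J15.0, J15.1, J15.8)
2. Search query: test two
Expected result: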
{ "id": "J15.1", "description": "test two world J15.0" },
{ "id": "J15.2", "description": "test two three world J15" },
Actual result:
{ "id": "J15.0", "description": "test one world" },
{ "id": "J15.1", "description": "test two world J15.0" },
{ "id": "J15.2", "description": "test two three world J15" },
The mapping was done as follows.
{
settings: {
number_of_shards: 1,
analysis: {
filter: {
ngram_filter: {
type: 'edge_ngram',
min_gram: 2,
max_gram: 20
}
},
analyzer: {
ngram_analyzer: {
type: 'custom',
tokenizer: 'standard',
filter: [
'lowercase', 'ngram_filter'
]
}
}
}
},
mappings: {
properties: {
group: {
type: 'text'
},
codes: {
type: 'nested',
properties: {
id: {
type: 'text',
analyzer: 'ngram_analyzer',
search_analyzer: 'standard'
},
description: {
type: 'text',
analyzer: 'ngram_analyzer',
search_analyzer: 'standard'
}
}
}
}
}
}
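As a rough illustration of what this mapping indexes, the sketch below models ngram_analyzer in plain Python. It is a simplification under stated assumptions: whitespace splitting stands in for the standard tokenizer, and the helper names edge_ngrams and index_tokens are hypothetical, not Elasticsearch APIs.

```python
def edge_ngrams(token, min_gram=2, max_gram=20):
    """Return the prefixes of token with length between min_gram and max_gram."""
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


def index_tokens(text):
    """Approximate ngram_analyzer: split, lowercase, then edge n-gram each token.

    Assumption: str.split() is only a stand-in for the standard tokenizer.
    """
    tokens = []
    for word in text.lower().split():
        tokens.extend(edge_ngrams(word))
    return tokens


if __name__ == "__main__":
    print(index_tokens("J15.0"))
    print("j15" in index_tokens("J15.0"))
```

Under this model the search term J15, lowercased to j15 by the standard search analyzer, is one of the indexed tokens of every J15.x id, which is why all of those records are expected to match.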
Search query:
GET myindex/_search
{
"_source": {
"excludes": [
"codes"
]
},
"query": {
"nested": {
"path": "codes",
"query": {
"bool": {
"should": [
{
"match": {
"codes.description": "J15"
}
},
{
"match": {
"codes.id": "J15"
}
}
]
}
},
"inner_hits": {}
}
}
}
Note: the document index will be large. Only sample data is shown here.
For the second issue, can I use multi_match with the AND operator, as shown below?
GET myindex/_search
{
"_source": {
"excludes": [
"codes"
]
},
"query": {
"nested": {
"path": "codes",
"query": {
"bool": {
"should": [
{
"multi_match": {
"query": "J15",
"fields": ["codes.id", "codes.description"],
"operator": "and"
}
}
]
}
},
"inner_hits": {}
}
}
}
Any help would be appreciated, as I am struggling to solve this.
Answer 0: (score: 1)
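The problem is that, by default, inner_hits returns only 3 matching documents, as mentioned in the official documentation:
size
The maximum number of hits to return per inner_hits. By default, the top three matching hits are returned.
Simply add the size parameter to your inner_hits to get all the search results.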
"inner_hits": {
"size": 10 // note this
}
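For context, here is a minimal sketch of where that setting sits in the full request, written as a Python dict so it can be serialized and sent with any HTTP client. The function name build_nested_query and the default of 10 are illustrative assumptions. The query itself is the one from the question.

```python
import json


def build_nested_query(term, inner_hits_size=10):
    """Build the nested search body with an explicit inner_hits size."""
    return {
        "_source": {"excludes": ["codes"]},
        "query": {
            "nested": {
                "path": "codes",
                "query": {
                    "bool": {
                        "should": [
                            {"match": {"codes.description": term}},
                            {"match": {"codes.id": term}},
                        ]
                    }
                },
                # Without "size", only the top three inner hits come back.
                "inner_hits": {"size": inner_hits_size},
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_nested_query("J15"), indent=2))
```

Since the question mentions that the index will be large, keep in mind that the inner_hits size is limited by the index setting index.max_inner_result_window (100 by default), so very long nested arrays may need paging with from and size.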
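I tried this on your sample data. Below is the result for your first query, which previously returned only 3 search results.
Search results for the first query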
"hits": [
{
"_index": "myindexedge64170045",
"_type": "_doc",
"_id": "1",
"_nested": {
"field": "codes",
"offset": 2
},
"_score": 1.8687118,
"_source": {
"id": "J15.1",
"description": "test two world J15.0"
}
},
{
"_index": "myindexedge64170045",
"_type": "_doc",
"_id": "1",
"_nested": {
"field": "codes",
"offset": 3
},
"_score": 1.7934312,
"_source": {
"id": "J15.2",
"description": "test two three world J15"
}
},
{
"_index": "myindexedge64170045",
"_type": "_doc",
"_id": "1",
"_nested": {
"field": "codes",
"offset": 0
},
"_score": 0.29618382,
"_source": {
"id": "J15",
"description": "hello world"
}
},
{
"_index": "myindexedge64170045",
"_type": "_doc",
"_id": "1",
"_nested": {
"field": "codes",
"offset": 1
},
"_score": 0.29618382,
"_source": {
"id": "J15.0",
"description": "test one world"
}
},
{
"_index": "myindexedge64170045",
"_type": "_doc",
"_id": "1",
"_nested": {
"field": "codes",
"offset": 4
},
"_score": 0.29618382,
"_source": {
"id": "J15.3",
"description": "hello world J18 "
}
},
{
"_index": "myindexedge64170045",
"_type": "_doc",
"_id": "1",
"_nested": {
"field": "codes",
"offset": 5
},
"_score": 0.29618382,
"_source": {
"id": "J15.9",
"description": "hello world new"
}
}
]
}
}
}
}
Answer 1: (score: 1)
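Adding another answer, because this is a separate issue and the first answer focused on the first issue.
The problem is that your second query, test two, also returns test one world. At index time you use ngram_analyzer, which is built on the standard tokenizer and therefore splits the text on whitespace, and your search analyzer is again standard. If you run the Analyze API on the indexed document and on the search term, you can see which tokens match: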
{
"text" : "test one world",
"analyzer" : "standard"
}
and the generated tokens are
{
"tokens": [
{
"token": "test",
"start_offset": 0,
"end_offset": 4,
"type": "<ALPHANUM>",
"position": 0
},
{
"token": "one",
"start_offset": 5,
"end_offset": 8,
"type": "<ALPHANUM>",
"position": 1
},
{
"token": "world",
"start_offset": 9,
"end_offset": 14,
"type": "<ALPHANUM>",
"position": 2
}
]
}
And for your search term test two:
{
"tokens": [
{
"token": "test",
"start_offset": 0,
"end_offset": 4,
"type": "<ALPHANUM>",
"position": 0
},
{
"token": "two",
"start_offset": 5,
"end_offset": 8,
"type": "<ALPHANUM>",
"position": 1
}
]
}
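As you can see, the test token is present in the document, and because a match query combines its terms with OR by default, that single matching token is enough to return the document. This can be fixed by using the AND operator in the query, as shown below.
Search query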
{
"_source": {
"excludes": [
"codes"
]
},
"query": {
"nested": {
"path": "codes",
"query": {
"bool": {
"must": {
"multi_match": {
"query": "test two",
"fields": [
"codes.id",
"codes.description"
],
"operator" :"AND"
}
}
}
},
"inner_hits": {}
}
}
}
And the search results
"hits": [
{
"_index": "myindexedge64170045",
"_type": "_doc",
"_id": "1",
"_nested": {
"field": "codes",
"offset": 2
},
"_score": 2.6901608,
"_source": {
"id": "J15.1",
"description": "test two world J15.0"
}
},
{
"_index": "myindexedge64170045",
"_type": "_doc",
"_id": "1",
"_nested": {
"field": "codes",
"offset": 3
},
"_score": 2.561376,
"_source": {
"id": "J15.2",
"description": "test two three world J15"
}
}
]
}
}
}
}
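The OR-versus-AND behaviour described in this answer can be modelled in a few lines of Python. This is only a sketch under stated assumptions: whitespace splitting stands in for the standard tokenizer, each field is checked on its own (as the default multi_match type does), and all helper names are hypothetical rather than Elasticsearch APIs.

```python
def edge_ngrams(token, min_gram=2, max_gram=20):
    """Return the set of prefixes of token between min_gram and max_gram long."""
    longest = min(len(token), max_gram)
    return {token[:n] for n in range(min_gram, longest + 1)}


def index_tokens(text):
    """Approximate the index-time analyzer: split, lowercase, edge n-gram."""
    tokens = set()
    for word in text.lower().split():
        tokens |= edge_ngrams(word)
    return tokens


def field_matches(field_text, query, operator="or"):
    """Check the query terms against one field's indexed tokens."""
    indexed = index_tokens(field_text)
    hits = [term in indexed for term in query.lower().split()]
    return all(hits) if operator == "and" else any(hits)


def doc_matches(doc, query, operator="or"):
    """A nested doc matches if any single field satisfies the operator."""
    return any(field_matches(value, query, operator) for value in doc.values())


# Sample records taken from the question.
CODES = [
    {"id": "J15.0", "description": "test one world"},
    {"id": "J15.1", "description": "test two world J15.0"},
    {"id": "J15.2", "description": "test two three world J15"},
]

if __name__ == "__main__":
    for operator in ("or", "and"):
        matched = [c["id"] for c in CODES if doc_matches(c, "test two", operator)]
        print(operator, matched)
```

With the OR operator all three records match because each contains the token test. With AND, J15.0 drops out because neither of its fields contains both test and two.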
Answer 2: (score: 0)
Adding a working example with index mapping, search query, and search result
Index mapping:
{
"settings": {
"analysis": {
"analyzer": {
"my_analyzer": {
"tokenizer": "my_tokenizer"
}
},
"tokenizer": {
"my_tokenizer": {
"type": "edge_ngram",
"min_gram": 2,
"max_gram": 20,
"token_chars": [
"letter",
"digit"
]
}
}
},
"max_ngram_diff": 50
},
"mappings": {
"properties": {
"group": {
"type": "text"
},
"codes": {
"type": "nested",
"properties": {
"id": {
"type": "text",
"analyzer": "my_analyzer"
}
}
}
}
}
}
Index data:
{
"group": "J00-I99",
"codes": [
{
"id": "J15",
"description": "hello world"
},
{
"id": "J15.0",
"description": "test one world"
},
{
"id": "J15.1",
"description": "test two world J15.0"
},
{
"id": "J15.2",
"description": "test two three world J15"
},
{
"id": "J15.3",
"description": "hello world J18 "
},
{
"id": "J15.9",
"description": "hello world new"
},
{
"id": "J16.0",
"description": "new description"
}
]
}
Search query:
{
"_source": {
"excludes": [
"codes"
]
},
"query": {
"nested": {
"path": "codes",
"query": {
"bool": {
"should": [
{
"match": {
"codes.description": "J15"
}
},
{
"match": {
"codes.id": "J15"
}
}
],
"must": {
"multi_match": {
"query": "test two",
"fields": [
"codes.id",
"codes.description"
],
"type": "phrase"
}
}
}
},
"inner_hits": {}
}
}
}
Search result:
"inner_hits": {
"codes": {
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": 3.2227304,
"hits": [
{
"_index": "stof_64170045",
"_type": "_doc",
"_id": "1",
"_nested": {
"field": "codes",
"offset": 3
},
"_score": 3.2227304,
"_source": {
"id": "J15.2",
"description": "test two three world J15"
}
},
{
"_index": "stof_64170045",
"_type": "_doc",
"_id": "1",
"_nested": {
"field": "codes",
"offset": 2
},
"_score": 2.0622847,
"_source": {
"id": "J15.1",
"description": "test two world J15.0"
}
}
]
}
}
}
}
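The phrase-type multi_match used in this answer requires the query terms to appear next to each other, in order, within a single field. Below is a simplified Python model of that check. It assumes the description field is analyzed into plain lowercase words (whitespace splitting is a stand-in for the real analyzer), and contains_phrase is a hypothetical helper, not an Elasticsearch API.

```python
def contains_phrase(text, phrase):
    """Return True if the words of phrase occur consecutively, in order, in text."""
    words = text.lower().split()
    wanted = phrase.lower().split()
    if not wanted:
        return False
    last_start = len(words) - len(wanted)
    return any(words[i:i + len(wanted)] == wanted for i in range(last_start + 1))


# Descriptions taken from the indexed sample data.
DESCRIPTIONS = {
    "J15.0": "test one world",
    "J15.1": "test two world J15.0",
    "J15.2": "test two three world J15",
}

if __name__ == "__main__":
    for code_id, description in DESCRIPTIONS.items():
        print(code_id, contains_phrase(description, "test two"))
```

In this model only J15.1 and J15.2 contain test two as a phrase, which agrees with the two inner hits shown above. Unlike the AND operator, a phrase match also rejects text where the words appear in a different order or with other words between them.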