Elasticsearch bool query: must match is not taken into account

Date: 2017-01-12 13:40:04

Tags: elasticsearch

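I am trying to write a query that returns documents where school is "holy international" and grade is "second". The problem with the current query is that it ignores the must match part: even though I did not specify a school, it still returns this document, which should not match. The query returns every document whose grade is "second". I only want documents where school is "holy international" and grade is "second". In other words, I passed nothing for "schools.school" in the match query, yet I still get a result.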

Mapping

{
    "settings": {
        "analysis": {
            "analyzer": {
                "my_keyword_lowercase1": {
                    "tokenizer": "keyword",
                    "filter": ["lowercase", "my_pattern_replace1", "trim"]
                },
                "my_keyword_lowercase2": {
                    "tokenizer": "standard",
                    "filter": ["lowercase", "trim"]
                }
            },
            "filter": {
                "my_pattern_replace1": {
                    "type": "pattern_replace",
                    "pattern": ".",
                    "replacement": ""
                }

            }
        }
    },
    "mappings": {
        "test_data": {
            "properties": {
                "schools": {
                    "type": "nested",
                    "properties": {
                        "school": {
                            "type": "string",
                            "analyzer": "my_keyword_lowercase1"
                        },
                        "grade": {
                            "type": "string",
                            "analyzer": "my_keyword_lowercase2"
                        }
                    }
                }
            }
        }
    }
}

Data

{
    "_index": "data_index",
    "_type": "test_data",
    "_id": "57a33ebc1d41",
    "_version": 1,
    "found": true,
    "_source": {
        "summary": null,
        "schools": [
            {
                "school": "little flower",
                "grade": "first",
                "date": "2007-06-01"
            },
            {
                "school": "holy international",
                "grade": "second",
                "date": "2007-06-01"
            }
        ],
        "first_name": "Adam",
        "location": "Kansas City",
        "last_name": "Roger",
        "country": "US",
        "name": "Adam Roger"
    }
}

Query

{
    "_source": ["first_name"],
    "query": {
        "nested": {
            "path": "schools",
            "inner_hits": {
                "_source": {
                    "includes": [
                        "schools.school",
                        "schools.grade"
                    ]
                }
            },
            "query": {
                "bool": {
                    "must": {
                        "match": {
                            "schools.school": {
                                "query": ""  <-----X didnt specify anything
                            }
                        }
                    },
                    "filter": {
                        "match": {
                            "schools.grade": {
                                "query": "second",
                                "operator": "and",
                                "minimum_should_match": "100%"
                            }
                        }
                    }
                }
            }
        }
    }
}

Result

{
  "took": 26,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "data_test",
        "_type": "test_data",
        "_id": "57a33ebc1d41",
        "_score": 0.2876821,
        "_source": {
          "first_name": "Adam"
        },
        "inner_hits": {
          "schools": {
            "hits": {
              "total": 1,
              "max_score": 0.2876821,
              "hits": [
                {
                  "_nested": {
                    "field": "schools",
                    "offset": 0
                  },
                  "_score": 0.2876821,
                  "_source": {
                    "schools": {
                      "school": "holy international",
                      "grade": "second"
                    }
                  }
                }
              ]
            }
          }
        }
      }
    ]
  }
}

1 answer:

Answer 0 (score: 1)

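Your problem is in the analysis step. Once I loaded everything and inspected it, this became clear:

This filter completely wipes out every string in the schools.school field:
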
  "filter": {
    "my_pattern_replace1": {
      "type": "pattern_replace",
      "pattern": ".",
      "replacement": ""
    }
  }

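I think this happens because "." in a regular expression is a metacharacter that matches any character, not a literal period, so every character of the token is replaced with nothing. This is what I get when I check it: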

POST /_analyze

{
  "field": "schools.school",
  "text": "holy international"
}

{
    "tokens": [
        {
            "token": "",
            "start_offset": 0,
            "end_offset": 18,
            "type": "word",
            "position": 0
        }
    ]
}

That is why you always get a match: every string you pass, at index time and at search time, becomes "". Some additional information from the Elastic reference: https://www.elastic.co/guide/en/elasticsearch/reference/5.1/analysis-pattern_replace-tokenfilter.html
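The collapse to an empty token can be reproduced outside Elasticsearch. The sketch below is a rough Python emulation of the my_keyword_lowercase1 chain (keyword tokenizer, lowercase, pattern_replace, trim), not the real analyzer. The helper name analyze_school and the sample value "St. Mary's School" are hypothetical, and it assumes Python's re module treats these two patterns the same way as the Java regex engine that Elasticsearch uses.

```python
import re


def analyze_school(text: str, pattern: str) -> str:
    """Rough emulation of: keyword tokenizer -> lowercase -> pattern_replace -> trim."""
    token = text                        # keyword tokenizer: the whole input is one token
    token = token.lower()               # lowercase filter
    token = re.sub(pattern, "", token)  # pattern_replace with replacement ""
    return token.strip()                # trim filter


# Unescaped dot: matches every character, so any value collapses to "".
broken_indexed = analyze_school("holy international", ".")
broken_query = analyze_school("", ".")

# Escaped dot: only literal periods are removed (hypothetical sample value).
fixed_indexed = analyze_school("St. Mary's School", r"\.")

print(repr(broken_indexed), repr(broken_query), repr(fixed_indexed))
```

With the unescaped pattern, the indexed value and the empty query string both reduce to the same empty token, which is consistent with the document matching even though no school was specified.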

After removing the pattern replace filter, this query returns everything as expected:

{
    "_source": ["first_name"],
    "query": {
        "nested": {
            "path": "schools",
            "inner_hits": {
                "_source": {
                    "includes": [
                        "schools.school",
                        "schools.grade"
                    ]
                }
            },
            "query": {
                "bool": {
                    "must": {
                        "match": {
                            "schools.school": {
                                "query": "holy international"  
                            }
                        }
                    },
                    "filter": {
                        "match": {
                            "schools.grade": {
                                "query": "second"
                            }
                        }
                    }
                }
            }
        }
    }
}
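If the intent of my_pattern_replace1 was to strip literal periods from school names (an assumption; the question does not say), an alternative to removing the filter might be to escape the dot. The Python sketch below only builds and prints such a settings body, mainly to show how the escaped pattern looks once encoded as JSON. It does not talk to a cluster, and documents indexed with the old analyzer would presumably need to be reindexed.

```python
import json

# Hypothetical corrected analysis settings: same analyzer as in the question,
# but the pattern is the regex \. (a literal period) instead of . (any character).
settings = {
    "analysis": {
        "analyzer": {
            "my_keyword_lowercase1": {
                "tokenizer": "keyword",
                "filter": ["lowercase", "my_pattern_replace1", "trim"],
            }
        },
        "filter": {
            "my_pattern_replace1": {
                "type": "pattern_replace",
                "pattern": r"\.",
                "replacement": "",
            }
        },
    }
}

# In the JSON text the backslash itself is escaped, so the pattern appears as "\\."
body = json.dumps({"settings": settings}, indent=2)
print(body)
```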