Query that ranks a partial field match above an exact match on a different field

Time: 2017-12-06 22:23:40

Tags: elasticsearch full-text-search

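I'm implementing a name search where the possible fields are first_name, middle_initial, and last_name. Queries are usually last name first, e.g. "Smith, A" looking for "Smith, Ashley" rather than "A Smith".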

My results come back scored in an order that makes no sense (Angela and Alex should rank above Roger and Ted):

  • "Smith, Roger A"
  • "Smith, Ted A"
  • "Smith, Angela D"
  • "Smith, Alex N"

I've tried a lot of things on both the indexing and the query side, and I have to allow for a good deal of fuzziness (misspellings and phonetic matches). A cross_match query through an n-gram analyzer plus some fuzziness meets most of my needs, except for this case. Edit: the list above is sorted by _score, so I can't simply sort it some other way.

Example query, where I was trying to see whether indexing the first name and middle initial together would make a difference:

GET /_search
{
  "query": {
    "bool": {
      "should": [
        {
          "multi_match": {
            "query": "smith, a",
            "type": "cross_fields",
            "fields": [
              "first_name_middle_initial^5",
              "last_name^10"
            ]
          }
        },
        {
          "multi_match": {
            "query": "smith, a",
            "type": "cross_fields",
            "fields": [
              "first_name_middle_initial.phonetic^2",
              "last_name.phonetic^5"
            ]
          }
        },
        {
          "multi_match": {
            "query": "smith, a",
            "type": "cross_fields",
            "fields": [
              "first_name_middle_initial.analyzed^2",
              "last_name.analyzed^10"
            ]
          }
        },
        {
          "bool": {
            "should": [
              {
                "match": {
                  "last_name.word_start": {
                    "query": "smith, a",
                    "boost": 10,
                    "operator": "and",
                    "analyzer": "searchkick_word_search"
                  }
                }
              },
              {
                "match": {
                  "last_name.word_start": {
                    "query": "smith, a",
                    "boost": 5,
                    "operator": "and",
                    "analyzer": "searchkick_word_search",
                    "fuzziness": 1,
                    "prefix_length": 0,
                    "max_expansions": 3,
                    "fuzzy_transpositions": true
                  }
                }
              }
            ]
          }
        },
        {
          "bool": {
            "should": [
              {
                "match": {
                  "first_name_middle_initial.word_start": {
                    "query": "smith, a",
                    "boost": 10,
                    "operator": "and",
                    "analyzer": "searchkick_word_search"
                  }
                }
              }
            ]
          }
        }
      ]
    }
  }
}

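I've also been using boosts to try to drown out anything that matches on the middle initial, and I've even tried leaving the middle initial out of my query text or out of the fields the query references (for example, reducing this to just {{1}}). I can't ignore the middle initial entirely, in case it turns out to be the differentiating field.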

1 answer:

Answer 0: (score: 1)

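OK, one of my problems may have been a stale index. Beyond that, the key seems to be using an ngram analyzer for one of my cross_fields matches and making sure middle_initial is treated completely separately (somewhat like a tie-breaker). Putting it in a bool subquery is intentional: I don't want it, or the other subqueries in that clause, to carry the same weight as the cross_fields matches, as described in {{3 }}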

Here is what finally solved my problem:

Index mapping:

{
  <snip>
      "first_name": {
        "type": "text",
        "fields": {
          "phonetic": {
            "type": "text",
            "analyzer": "dbl_metaphone"
          },
          "word_start": {
            "type": "text",
            "analyzer": "searchkick_word_start_index" // includes "lowercase", "asciifolding", "searchkick_edge_ngram" (ngram from the start of the word)
          }
        }
      },
      <snip>
      "last_name": {
        "type": "text",
        "fields": {
          "phonetic": {
            "type": "text",
            "analyzer": "dbl_metaphone"
          },
          "word_start": {
            "type": "text",
            "analyzer": "searchkick_word_start_index"
          }
        }
      },
      "middle_initial": {
        "type": "keyword",
        "fields": {
          "analyzed": {
            "type": "text",
            "analyzer": "searchkick_index" // includes lowercase, asciifolding, shingles, stemmer
          }
        },
        "ignore_above": 30000
      },
      <snip>
    }
  }
}
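
To see why the word_start subfields let a bare initial such as "a" match "Angela" or "Alex", it may help to sketch what an edge n-gram filter emits at index time. The Python below is only an illustration of the idea, not Searchkick's or Elasticsearch's implementation. The function name edge_ngrams and the min_gram/max_gram defaults are assumptions, so check the analyzer settings of your own index.

```python
def edge_ngrams(token, min_gram=1, max_gram=50):
    """Return the prefixes of a lowercased token, shortest first.

    min_gram and max_gram are assumed values for illustration only;
    the real limits come from the index's analyzer settings.
    """
    token = token.lower()
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


if __name__ == "__main__":
    # Every prefix of the name becomes a searchable token.
    print(edge_ngrams("Angela"))
```

With tokens like these in the index, a one-letter query term can match the start of a first name, which is what the first_name.word_start clause in the query relies on.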

Query:

{
  "query": {
    "bool": {
      "should": [
        [
          {
            "multi_match": {
              "query": "smith, s",
              "type": "cross_fields",
              "fields": [
                "first_name^2",
                "last_name^3"
              ],
              "tie_breaker": 0.3
            }
          },
          {
            "multi_match": {
              "query": "smith, s",
              "type": "cross_fields",
              "fields": [
                "first_name.phonetic",
                "last_name.phonetic"
              ],
              "tie_breaker": 0.3
            }
          },
          {
            "multi_match": {
              "query": "smith, s",
              "type": "cross_fields",
              "fields": [
                "first_name.word_start",
                "last_name.word_start^2"
              ],
              "tie_breaker": 0.3
            }
          }
        ],
        {
          "bool": {
            "should": [
            <snip subquery for another field>
              {
                "match": {
                  "middle_initial.analyzed": {
                    "query": "s",
                    "operator": "and"
                  }
                }
              }
            ]
          }
        }
      ]
    }
  }
}
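
The request above sends the full text "smith, s" to the cross_fields clauses and only "s" to middle_initial.analyzed. The answer does not say how that initial was extracted, so the following Python is a rough sketch under assumptions: the helper names split_name_query and build_name_query are hypothetical, the rule "a trailing one-letter token is the initial" is a guess, the snipped subquery is omitted, and the clauses are placed in a single flat should list.

```python
import json


def split_name_query(text):
    """Return (full_text, initial); initial is a trailing one-letter token or None.

    Assumption: the initial is simply the last token when it is one character long.
    """
    tokens = text.replace(",", " ").split()
    initial = tokens[-1] if len(tokens) > 1 and len(tokens[-1]) == 1 else None
    return text, initial


def build_name_query(text):
    """Build a request body shaped like the query in the answer."""
    full_text, initial = split_name_query(text)
    field_sets = [
        ["first_name^2", "last_name^3"],
        ["first_name.phonetic", "last_name.phonetic"],
        ["first_name.word_start", "last_name.word_start^2"],
    ]
    # One cross_fields clause per group of subfields, as in the answer.
    should = [
        {
            "multi_match": {
                "query": full_text,
                "type": "cross_fields",
                "fields": fields,
                "tie_breaker": 0.3,
            }
        }
        for fields in field_sets
    ]
    if initial is not None:
        # The middle initial lives in its own bool clause so it acts as a
        # separate, lower-weight signal rather than part of the cross_fields match.
        should.append(
            {
                "bool": {
                    "should": [
                        {
                            "match": {
                                "middle_initial.analyzed": {
                                    "query": initial,
                                    "operator": "and",
                                }
                            }
                        }
                    ]
                }
            }
        )
    return {"query": {"bool": {"should": should}}}


if __name__ == "__main__":
    print(json.dumps(build_name_query("smith, s"), indent=2))
```

The resulting dictionary can be serialized and sent as the body of a search request; when the input has no one-letter token, the middle-initial clause is left out.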