Question

I'm implementing a name search where the possible fields are first_name, middle_initial and last_name. Queries are usually given last name first, e.g. "Smith, A" when looking for "Smith, Ashley" rather than "A Smith".

My results are scored in an order that doesn't make sense (Angela and Alex should rank above Roger and Ted):

"Smith, Roger A"
"Smith, Ted A"
"Smith, Angela D"
"Smith, Alex N"

I've tried a lot of things on both the index side and the query side, and I have to allow for a good deal of fuzziness (misspellings and phonetic matches). A cross_fields query through an n-gram analyzer plus some fuzziness meets most of my needs, except for this case. Edit: the list above is sorted by _score, so I can't simply sort it some other way.

Example query, in which I was trying to see whether indexing the first name and middle initial together made a difference:

GET /_search
{
  "query": {
    "bool": {
      "should": [
        {
          "multi_match": {
            "query": "smith, a",
            "type": "cross_fields",
            "fields": [
              "first_name_middle_initial^5",
              "last_name^10"
            ]
          }
        },
        {
          "multi_match": {
            "query": "smith, a",
            "type": "cross_fields",
            "fields": [
              "first_name_middle_initial.phonetic^2",
              "last_name.phonetic^5"
            ]
          }
        },
        {
          "multi_match": {
            "query": "smith, a",
            "type": "cross_fields",
            "fields": [
              "first_name_middle_initial.analyzed^2",
              "last_name.analyzed^10"
            ]
          }
        },
        {
          "bool": {
            "should": [
              {
                "match": {
                  "last_name.word_start": {
                    "query": "smith, a",
                    "boost": 10,
                    "operator": "and",
                    "analyzer": "searchkick_word_search"
                  }
                }
              },
              {
                "match": {
                  "last_name.word_start": {
                    "query": "smith, a",
                    "boost": 5,
                    "operator": "and",
                    "analyzer": "searchkick_word_search",
                    "fuzziness": 1,
                    "prefix_length": 0,
                    "max_expansions": 3,
                    "fuzzy_transpositions": true
                  }
                }
              }
            ]
          }
        },
        {
          "bool": {
            "should": [
              {
                "match": {
                  "first_name_middle_initial.word_start": {
                    "query": "smith, a",
                    "boost": 10,
                    "operator": "and",
                    "analyzer": "searchkick_word_search"
                  }
                }
              }
            ]
          }
        }
      ]
    }
  }
}

I'm also boosting, to try to drown out anything matched on the middle initial, even to the point of not including the middle initial in my query or in the fields I reference in the query (e.g. this one is just {{1}}). I can't ignore the middle initial entirely, in case it turns out to be the differentiating field.

Answer 1

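OK, one of my problems may have been a stale index. Beyond that, the key seems to be using an ngram analyzer for one of my cross_fields matches and making sure middle_initial is scored completely separately (somewhat like a tie-breaker). Putting it in a bool subquery is intentional: I don't want it, or the other subqueries in that clause, to carry the same weight as the cross_fields matches, as in {{3 }}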

Here is what ultimately solved my problem:

Index mapping:

{
  <snip>
      "first_name": {
        "type": "text",
        "fields": {
          "phonetic": {
            "type": "text",
            "analyzer": "dbl_metaphone"
          },
          "word_start": {
            "type": "text",
            "analyzer": "searchkick_word_start_index" // includes "lowercase", "asciifolding", "searchkick_edge_ngram" (ngram from the start of the word)
          }
        }
      },
      <snip>
      "last_name": {
        "type": "text",
        "fields": {
          "phonetic": {
            "type": "text",
            "analyzer": "dbl_metaphone"
          },
          "word_start": {
            "type": "text",
            "analyzer": "searchkick_word_start_index"
          }
        }
      },
      "middle_initial": {
        "type": "keyword",
        "fields": {
          "analyzed": {
            "type": "text",
            "analyzer": "searchkick_index" // includes lowercase, asciifolding, shingles, stemmer
          }
        },
        "ignore_above": 30000
      },
      <snip>
    }
  }
}
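
To illustrate why the word_start subfields let a bare "a" reach "Angela" and "Alex" through first_name, here is a minimal Python sketch of what an edge n-gram filter emits. It is not searchkick's implementation: the function names are hypothetical, the min_gram and max_gram values are assumptions, and asciifolding is left out.

```python
def edge_ngrams(word, min_gram=1, max_gram=50):
    """Return the lowercased prefixes an edge n-gram filter would emit.

    min_gram and max_gram are assumed values, not read from the mapping.
    """
    token = word.lower()
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def word_start_matches(query_term, indexed_value):
    """True if the lowercased query term equals one of the indexed prefixes."""
    return query_term.lower() in edge_ngrams(indexed_value)


for first_name in ["Roger", "Ted", "Angela", "Alex"]:
    print(first_name, word_start_matches("a", first_name))
# Roger False
# Ted False
# Angela True
# Alex True
```

With prefixes indexed this way, a partial first name can score through the cross_fields clause on first_name.word_start instead of relying only on an exact hit on middle_initial.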

Query:

{
  "query": {
    "bool": {
      "should": [
        {
          "multi_match": {
            "query": "smith, s",
            "type": "cross_fields",
            "fields": [
              "first_name^2",
              "last_name^3"
            ],
            "tie_breaker": 0.3
          }
        },
        {
          "multi_match": {
            "query": "smith, s",
            "type": "cross_fields",
            "fields": [
              "first_name.phonetic",
              "last_name.phonetic"
            ],
            "tie_breaker": 0.3
          }
        },
        {
          "multi_match": {
            "query": "smith, s",
            "type": "cross_fields",
            "fields": [
              "first_name.word_start",
              "last_name.word_start^2"
            ],
            "tie_breaker": 0.3
          }
        },
        {
          "bool": {
            "should": [
            <snip subquery for another field>
              {
                "match": {
                  "middle_initial.analyzed": {
                    "query": "s",
                    "operator": "and"
                  }
                }
              }
            ]
          }
        }
      ]
    }
  }
}
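
The answer does not say how the "s" sent to middle_initial.analyzed is derived from "smith, s". The sketch below shows one way the request body could be assembled in Python. The helper names and the rule for picking the initial (first character after the comma) are assumptions for illustration only, and the snipped subquery for the other field is omitted.

```python
import json


def split_name_query(raw):
    """Split 'last, first' input into (full_text, initial_candidate).

    Assumed rule: the candidate is the first character after the comma,
    or None when there is no comma or nothing follows it.
    """
    _, sep, rest = raw.partition(",")
    rest = rest.strip()
    initial = rest[0].lower() if sep and rest else None
    return raw.strip(), initial


def cross_fields(text, fields):
    """One cross_fields multi_match clause with the answer's tie_breaker."""
    return {
        "multi_match": {
            "query": text,
            "type": "cross_fields",
            "fields": fields,
            "tie_breaker": 0.3,
        }
    }


def build_name_query(raw):
    """Build the bool query: three cross_fields clauses plus a separate,
    unboosted bool clause for the middle initial."""
    text, initial = split_name_query(raw)
    should = [
        cross_fields(text, ["first_name^2", "last_name^3"]),
        cross_fields(text, ["first_name.phonetic", "last_name.phonetic"]),
        cross_fields(text, ["first_name.word_start", "last_name.word_start^2"]),
    ]
    if initial is not None:
        should.append({
            "bool": {
                "should": [
                    {
                        "match": {
                            "middle_initial.analyzed": {
                                "query": initial,
                                "operator": "and",
                            }
                        }
                    }
                ]
            }
        })
    return {"query": {"bool": {"should": should}}}


if __name__ == "__main__":
    print(json.dumps(build_name_query("smith, s"), indent=2))
```

Keeping the middle-initial match in its own bool clause, without a boost, mirrors the intent stated in the answer: it can separate otherwise equal candidates without competing with the cross_fields matches.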

Query that ranks partial field matches above exact matches on a different field
