Elasticsearch 5.5: Need help with name search (position gets a higher score)

Time: 2017-07-12 19:24:10

Tags: elasticsearch

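I am trying to build an autocomplete feature that gives a higher score depending on the position of the typed words. Results should be sorted by score, then by name.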

My goal:

Input "pet"

Result:

peter christensen
peter christian grau
peter christian reumert krogsgaard
peter bruun christensen
anders petersen

Input "peter chr"

Result:

peter christensen
peter christian grau
peter christian reumert krogsgaard
peter bruun christensen

The above works and I am happy with it, but when there is a duplicate "peter christensen", something strange happens. The results now look like this:

peter christian grau
peter christian reumert krogsgaard
peter christensen
peter christensen
peter bruun christensen

I want this:

peter christensen
peter christensen
peter christian grau
peter christian reumert krogsgaard
peter bruun christensen

Can anyone help?

Settings:

{
    "persons_index": {
        "settings": {
            "index": {
                "number_of_shards": "5",
                "provided_name": "persons_index",
                "creation_date": "1499881803116",
                "analysis": {
                    "filter": {
                        "ascii_folding_preserve_original": {
                            "type": "asciifolding",
                            "preserve_original": "true"
                        },
                        "names_synonym_filter": {
                            "type": "synonym",
                            "synonyms": [
                                "aage,åge",
                                "gaard,gård"
                            ]
                        }
                    },
                    "analyzer": {
                        "my_analyzer": {
                            "filter": [
                                "lowercase",
                                "trim",
                                "names_synonym_filter",
                                "ascii_folding_preserve_original"
                            ],
                            "type": "custom",
                            "tokenizer": "standard"
                        }
                    }
                },
                "number_of_replicas": "1",
                "uuid": "LxMHLha-R02i__S7gDWBtA",
                "version": {
                    "created": "5050099"
                }
            }
        }
    }
}

Mapping:

{
    "persons_index": {
        "mappings": {
            "persons_type": {
                "properties": {
                    "fieldDisplayFullName": {
                        "type": "text",
                        "norms": false,
                        "analyzer": "my_analyzer"
                    },
                    "fieldSort": {
                        "type": "text",
                        "norms": false,
                        "analyzer": "keyword",
                        "fielddata": true
                    }
                }
            }
        }
    }
}

Sample document:

{"fieldSort" : "peter christensen", "fieldDisplayFullName" : "Peter Christensen"}

Query:

_search?pretty=true&search_type=dfs_query_then_fetch

{
  "explain": false,
  "size" : 50,
  "sort": [
        "_score",
        {
            "fieldSort": {
                "order": "asc"
            }
        }
    ],
  "query": {
    "bool" : {
      "must" : {
        "bool" : {
              "minimum_should_match":"2",
              "should" : [
                {"match" : { "fieldDisplayFullName" : "peter" }},
                {"wildcard" : { "fieldDisplayFullName" : "chr*" }}
              ]
        }
      },
       "should" : [
          {"match_phrase_prefix" : { "fieldSort" : {"query" : "peter chr","boost" : 10}}}
        ]
    }
  }
}

Result:

Peter Christensen (_score: 18.968887)
Peter Christian Engelhardt (_score: 18.968887)
Peter Christian Grau (_score: 18.968887)
Peter Christian Reumert Krogsgaard (_score: 18.968887)
Peter Christian Vagnbo Jørgensen (_score: 18.968887)
Peter Christoffersen (_score: 18.968887)
Peter Bruun Christensen (_score: 1.0512933)
Peter Dits Christensen (_score: 1.0512933)
Peter Fjeldsø Christensen (_score: 1.0512933)

With a duplicate "Peter Christensen" (_score: 14.909464):

Peter Christian Engelhardt (_score: 20.01772)
Peter Christian Grau (_score: 20.01772)
Peter Christian Reumert Krogsgaard (_score: 20.01772)
Peter Christian Vagnbo Jørgensen (_score: 20.01772)
Peter Christoffersen (_score: 20.01772)
Peter Christensen (_score: 14.909464)
Peter Christensen (_score: 14.909464)
Peter Bruun Christensen (_score: 1.04652)
Peter Dits Christensen (_score: 1.04652)
Peter Fjeldsø Christensen (_score: 1.04652)
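
To make the ordering above easier to follow, here is a small client-side sketch of what the sort clause (`_score` first, then `fieldSort` ascending) does with a few of these hits. The `rank` helper is hypothetical and only mimics the sort; it is not part of Elasticsearch. It shows that the name only acts as a tie-breaker between equal scores, so the duplicates drop down as soon as their score is lower.

```python
# A few hits copied from the result list above, deliberately shuffled: (name, _score).
hits = [
    ("Peter Bruun Christensen", 1.04652),
    ("Peter Christensen", 14.909464),
    ("Peter Christoffersen", 20.01772),
    ("Peter Christensen", 14.909464),
    ("Peter Christian Grau", 20.01772),
]


def rank(hits):
    """Order by score descending, then by lowercased name ascending.

    Hypothetical helper that imitates "sort": ["_score", {"fieldSort": "asc"}].
    """
    return sorted(hits, key=lambda hit: (-hit[1], hit[0].lower()))


ranked = rank(hits)
for name, score in ranked:
    print(f"{score:>10.6f}  {name}")
```

Because both "Peter Christensen" documents score 14.909464 instead of 20.01772, no secondary sort on the name can lift them to the top; the scores themselves have to become equal first.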

A different query attempt, which did not work either:

{
  "explain": false,
  "size" : 50,
  "sort": [
        "_score",
        {
            "fieldSort": {
                "order": "asc"
            }
        }
    ],
  "query": {
    "bool" : {
      "must" : {
        "bool" : {
              "minimum_should_match":"2",
              "should" : [
                {"match" : { "fieldDisplayFullName" : "peter" }},
                {"wildcard" : { "fieldDisplayFullName" : "chr*" }}
              ]
        }
      },
       "should" : [
            {"span_first" : {
              "match" : {
                  "span_term" : { "fieldDisplayFullName" : "peter" }
              },
              "end" : 1,
              "boost" : 50.0
            }},
            {"span_first" : {
              "match" : {
                  "span_multi":{
                    "match":{
                        "wildcard" : { "fieldDisplayFullName" : "chr*" }
                    }
                  }
              },
              "end" : 2,
              "boost" : 50.0
            }}
        ]
    }
  }
}

2 Answers:

Answer 0 (score: 1)

Solved it with "constant_score":

{
  "explain": false,
  "size" : 50,
  "sort": [
        "_score",
        {
            "fieldSort": {
                "order": "asc"
            }
        }
    ],
  "query": {
    "bool" : {
      "must" : {
        "bool" : {
              "minimum_should_match":"2",
              "should" : [
                {"match" : { "fieldDisplayFullName" : "peter" }},
                {"wildcard" : { "fieldDisplayFullName" : "chr*" }}
              ]
        }
      },
       "should" : [
            { "constant_score": {
                "boost":   2,
                "query": {"match_phrase_prefix" : { "fieldSort" : {"query" : "peter chr"}}}
            }}
        ]
    }
  }
}
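
As a rough sketch of how the query above could be generated from whatever the user has typed, the following Python builds the same request body. The function name `build_name_query` and its parameters are made up for illustration; the field names and the query shape are taken from the question and this answer. It assumes the last word is the one still being typed and treats every earlier word as complete.

```python
import json


def build_name_query(text, display_field="fieldDisplayFullName",
                     sort_field="fieldSort", boost=2):
    """Build the constant_score variant of the search body (hypothetical helper)."""
    # Wildcard queries are not analyzed, so lowercase here to match the indexed tokens.
    words = text.lower().split()
    if not words:
        raise ValueError("empty search text")

    *complete, prefix = words
    # Complete words use "match"; the word still being typed uses a trailing wildcard.
    clauses = [{"match": {display_field: word}} for word in complete]
    clauses.append({"wildcard": {display_field: prefix + "*"}})

    return {
        "size": 50,
        "sort": ["_score", {sort_field: {"order": "asc"}}],
        "query": {
            "bool": {
                "must": {
                    "bool": {
                        # Every typed word has to match.
                        "minimum_should_match": str(len(clauses)),
                        "should": clauses,
                    }
                },
                "should": [
                    {
                        # Same fixed boost for every name that starts with the input.
                        "constant_score": {
                            "boost": boost,
                            "query": {
                                "match_phrase_prefix": {
                                    sort_field: {"query": " ".join(words)}
                                }
                            },
                        }
                    }
                ],
            }
        },
    }


body = build_name_query("peter chr")
print(json.dumps(body, indent=2))
```

The point of the `constant_score` wrapper, as far as I can tell, is that every document whose `fieldSort` starts with the input receives the same boost, so names with the same number of matching words end up with equal scores and the `fieldSort` tie-breaker decides their order.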

Answer 1 (score: 0)

Read up on how Elasticsearch computes relevance scores, in particular IDF (inverse document frequency).

So you can try several approaches that ignore IDF to work around this. I had a similar problem and solved it by changing the search type to dfs_query_then_fetch.

Does your configuration have more than 1 shard?
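
To illustrate the IDF effect mentioned above, here is a minimal sketch of the BM25 IDF formula, which I believe is what the default similarity in Elasticsearch 5.x uses. The document count is a made-up number. The sketch assumes that the prefix on `fieldSort` expands to whole-name terms (the field uses the keyword analyzer) and that each expanded term is scored with its own document frequency, so a name that exists twice has a lower IDF than a name that exists once.

```python
import math


def bm25_idf(doc_freq, doc_count):
    """BM25 inverse document frequency: ln(1 + (N - n + 0.5) / (n + 0.5))."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


doc_count = 1000  # hypothetical number of documents; not taken from the question

unique = bm25_idf(1, doc_count)     # a name that is indexed once
duplicate = bm25_idf(2, doc_count)  # a name that is indexed twice

gap = unique - duplicate
print(f"idf(unique)    = {unique:.4f}")
print(f"idf(duplicate) = {duplicate:.4f}")
print(f"gap            = {gap:.4f}")
print(f"gap * boost 10 = {gap * 10:.4f}")
```

The gap works out to ln(2.5 / 1.5), roughly 0.51, whatever the document count is. Multiplied by the boost of 10 from the original query that is about 5.1, which lines up with the difference between 20.01772 and 14.909464 in the question, so the duplicate's lower IDF looks like the likely cause.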

Cheers, Dominik