How can I fine-tune search results in Elasticsearch 6.8 so that documents containing the most search words rank first?

Asked: 2019-08-26 12:27:14

Tags: elasticsearch elasticsearch-6.8

Here is my mapping:

{
  "mappings": {
    "_doc": {
      "properties": {
        "text": { 
          "type": "text",
          "fields": {
            "raw": { 
              "type":     "keyword",
              "normalizer": "case_insensitive"
            }
          }
        }
      }
    }
  }
}

Here are the settings:

{
  "settings": {
    "index": {
      "analysis" : {
        "normalizer" : {
          "case_insensitive" : {
            "filter" : "lowercase"
          }
        },
        "analyzer" : {
          "en_std" : {
            "type" : "standard",
            "stopwords" : "_english_"
          }
        }
      }
    }
  }
}

Here is my query:

{
  "query": {
    "bool" : {
      "must" : {
        "query_string" : {
          "query" : "hawaii beach 2019",
          "analyze_wildcard": true,
          "fields": [
            "text"
          ]
        }
      }
    }
  }
}

Here is some sample data stored in Elasticsearch:

[
  {
     "text": "blue hawaii hotel"
  },
  {
     "text": "costa beach"
  },
  {
     "text": "white hawaii beach"
  },
  {
     "text": "nice hotel 2019"
  },
  {
     "text": " some 2019 white beach hawaii photo"
  },
  {
     "text": "hawaii vacation 2019"
  }
]

If my search term is hawaii, I get three results:

[
  {
     "text": "blue hawaii hotel"
  },
  {
     "text": "white hawaii beach"
  },
  {
     "text": " some 2019 white beach hawaii photo"
  }
]

If my search term is hawaii beach, I get four results:

[
  {
     "text": "blue hawaii hotel"
  },
  {
     "text": "costa beach"
  },
  {
     "text": "white hawaii beach"
  },
  {
     "text": " some 2019 white beach hawaii photo"
  }
]

If my search term is hawaii beach 2019, I get five results:

[
  {
     "text": "blue hawaii hotel"
  },
  {
     "text": "costa beach"
  },
  {
     "text": "white hawaii beach"
  },
  {
     "text": "nice hotel 2019"
  },
  {
     "text": " some 2019 white beach hawaii photo"
  }
]

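This is because each of these records contains at least one word of the search text. That makes sense, but it is not what I want. I want the records that contain the most matching words to appear at the top of the search results, and the records that contain fewer matching words to appear at the bottom. How can I do this in Elasticsearch 6.8? If that is not possible, I would also accept showing only the records that contain the most matching words.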

Desired search results, for example when my search text is hawaii beach 2019:

[
  {
     "text": " some 2019 white beach hawaii photo" // Contains the most matching words.
  },
  {
     "text": "white hawaii beach"
  },
  {
     "text": "blue hawaii hotel" // Contains fewer matching words.
  },
  {
     "text": "costa beach" // Contains fewer matching words.
  },
  {
     "text": "nice hotel 2019" // Contains fewer matching words.
  }
]

Or, if ranking is not possible, only the best match:

[
  {
     "text": " some 2019 white beach hawaii photo" // Contains the most matching words.
  }
]
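To make the desired ordering concrete, here is a small Python sketch (plain Python, not Elasticsearch) that counts how many distinct search words each sample document contains and sorts by that count. The names `count_matches` and `rank` are made up for illustration, and the lowercase-and-split-on-whitespace tokenization is only a rough stand-in for what an analyzer does.

```python
# Sample documents copied from the data above.
docs = [
    "blue hawaii hotel",
    "costa beach",
    "white hawaii beach",
    "nice hotel 2019",
    " some 2019 white beach hawaii photo",
    "hawaii vacation 2019",
]

def count_matches(text, query):
    """Number of distinct query words present in text (assumed tokenization:
    lowercase, split on whitespace)."""
    words = set(text.lower().split())
    return sum(1 for term in set(query.lower().split()) if term in words)

def rank(documents, query):
    """Keep documents with at least one match, most matches first."""
    scored = [(count_matches(d, query), d) for d in documents]
    scored = [(n, d) for n, d in scored if n > 0]
    # sorted() is stable, so ties keep their original order.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

for n, d in rank(docs, "hawaii beach 2019"):
    print(n, repr(d))
```

Under this simplified model the document with all three words comes first, followed by the two-word matches and then the one-word matches. Note that it also ranks the sixth sample document, "hawaii vacation 2019", which has two matching words.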

2 Answers:

Answer 0: (score: 0)

You can modify the input query:

hawaii AND beach AND 2019

Then you will get only the results that contain all 3 words.
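The same effect should be achievable without rewriting the user's input, by setting `default_operator` to `AND` on the `query_string` query. A looser variant uses `minimum_should_match` to require, for example, at least two of the three words. The sketch below only builds the request bodies as Python dicts and prints them as JSON. The helper name `build_query` and its parameters are hypothetical, and sending the body to a cluster is left out.

```python
import json

def build_query(text, field="text", require_all=True, min_match=None):
    """Build a query_string request body (hypothetical helper).

    require_all=True -> every word must match (default_operator AND).
    min_match="2"    -> at least that many words must match.
    """
    qs = {"query": text, "fields": [field]}
    if require_all:
        qs["default_operator"] = "AND"
    elif min_match is not None:
        qs["minimum_should_match"] = min_match
    return {"query": {"query_string": qs}}

# Strict: only documents containing all three words.
print(json.dumps(build_query("hawaii beach 2019"), indent=2))

# Looser: documents containing at least two of the three words.
print(json.dumps(
    build_query("hawaii beach 2019", require_all=False, min_match="2"),
    indent=2,
))
```

With the strict variant only the document that contains all three words would be returned, which matches the fallback the question asks for but not the ranked list.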

Answer 1: (score: 0)

I think I have found a workaround: surround each word in the search string with *, as shown below.

{ 
  "query": { 
    "bool": { 
      "must": { 
        "bool": { 
          "should": { 
            "query_string": { 
              "query": "*hawaii* *beach* *2019*", 
              "fields": ["text"]
            } 
          } 
        } 
      } 
    } 
  } 
}

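With this query I get all documents that contain at least one word from the search string. The documents that match the most search words appear at the top of the list.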
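A likely reason this works: wildcard terms are, by default, rewritten to constant-score queries, so every matching word contributes the same score and the total roughly equals the number of matched words. Leading wildcards can be slow on large indices, though. A sketch of an alternative that aims for the same ordering without wildcards wraps each word in a `constant_score` clause inside `bool.should`. The helper name `build_count_query` is hypothetical, and splitting the input on whitespace is an assumption on my part.

```python
import json

def build_count_query(text, field="text"):
    """One constant_score clause per word (hypothetical helper).

    Each matching clause adds its boost (1) to the document score,
    so the score approximates the number of matched words.
    """
    clauses = [
        {"constant_score": {"filter": {"match": {field: word}}, "boost": 1}}
        for word in text.split()
    ]
    # minimum_should_match=1: a document must contain at least one word.
    return {"query": {"bool": {"should": clauses, "minimum_should_match": 1}}}

body = build_count_query("hawaii beach 2019")
print(json.dumps(body, indent=2))
```

Raising `minimum_should_match` to the number of words would restrict the results to documents that contain every word.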