Multi-word synonyms and phrase queries

Asked: 2015-08-20 18:58:15

Tags: elasticsearch

Is there an error in the Elastic documentation?

Given the following index settings:

PUT /my_index
{
  "settings": {
    "analysis": {
      "filter": {
        "my_synonym_filter": {
          "type": "synonym",
          "synonyms": [
            "usa,united states,u s a,united states of america"
          ]
        }
      },
      "analyzer": {
        "my_synonyms": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "my_synonym_filter"
          ]
        }
      }
    }
  }
}

Given this document:

PUT /my_index/country/1
{
  "title" : "The United States is wealthy"
}

the documentation states:

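These phrases would not match:

The usa is wealthy

The united states of america is wealthy

The U.S.A. is wealthy

However, these phrases would:

United states is wealthy

Usa states of wealthy

The U.S. of wealthy

U.S. is america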

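However, that does not seem to be the case: the phrases that are supposed to match do not match at all! This is the query I'm running (with no synonym expansion at query time, per the documentation):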

GET /my_index/country/_search
{

    "query" : {
        "match_phrase" : {
            "title" : {
               "query" : "United States is wealthy",
               "analyzer": "standard"
            }

        }
    }
}

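What am I missing here?

2 Answers:

Answer 0 (score: 1)

The example from the documentation works for me.

Perhaps you forgot to set the analyzer for the title field in the mapping.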
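To see why the field's analyzer matters, here is a rough Python sketch of the idea. It is not Elasticsearch code and only approximates how a synonym filter stacks tokens at index time. The helpers tokenize, expand and phrase_match are hypothetical names made up for this illustration, and whitespace splitting stands in for the standard tokenizer.

```python
# Simplified model of index-time synonym expansion and phrase matching.
# Not Elasticsearch code: tokenize, expand and phrase_match are hypothetical helpers.

SYNONYMS = [s.split() for s in
            "usa,united states,u s a,united states of america".split(",")]

def tokenize(text):
    """Stand-in for the standard tokenizer plus lowercase: split on whitespace."""
    return text.lower().split()

def expand(tokens):
    """Stack every synonym at the position where a synonym phrase starts."""
    positions = [set() for _ in tokens]

    def put(pos, token):
        while len(positions) <= pos:
            positions.append(set())
        positions[pos].add(token)

    i = 0
    while i < len(tokens):
        hits = [s for s in SYNONYMS if tokens[i:i + len(s)] == s]
        if not hits:
            put(i, tokens[i])
            i += 1
            continue
        for synonym in SYNONYMS:
            for offset, token in enumerate(synonym):
                put(i + offset, token)
        i += len(max(hits, key=len))  # skip the longest matched phrase
    return positions

def phrase_match(doc_positions, query_tokens):
    """True if the query tokens occur at consecutive document positions."""
    last_start = len(doc_positions) - len(query_tokens)
    return any(
        all(q in doc_positions[start + k] for k, q in enumerate(query_tokens))
        for start in range(last_start + 1)
    )

doc = tokenize("The United States is wealthy")
without_synonyms = [{t} for t in doc]   # title left on the default analyzer
with_synonyms = expand(doc)             # title mapped to my_synonyms

for phrase in ("usa states of wealthy", "the usa is wealthy"):
    q = tokenize(phrase)                # query analyzed without synonyms
    print(phrase, phrase_match(without_synonyms, q), phrase_match(with_synonyms, q))
```

In this model, "usa states of wealthy" only matches when the synonyms were stacked at index time, which is what you would expect if the title field was never mapped to my_synonyms.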

Example:

1) Create the index

PUT /my_index
{
  "settings": {
    "analysis": {
      "filter": {
        "my_synonym_filter": {
          "type": "synonym",
          "synonyms": [
            "usa,united states,u s a,united states of america"
          ]
        }
      },
      "analyzer": {
        "my_synonyms": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "my_synonym_filter"
          ]
        }
      }
    }
  }
}

2) Add the mapping

PUT my_index/country/_mapping
{
    "properties" : {
        "title" : {"type" : "string","analyzer" : "my_synonyms"}
    }
}

3) Index the document

PUT /my_index/country/1
{
  "title" : "The United States is wealthy"
}

4) Query

GET /my_index/country/_search
{

    "query" : {
        "match_phrase" : {
            "title" : {
               "query" : "United States is wealthy",
               "analyzer": "standard"
            }

        }
    }
}

5) Response:

{
   "took": 8,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "hits": {
      "total": 1,
      "max_score": 0.75942194,
      "hits": [
         {
            "_index": "my_index",
            "_type": "country",
            "_id": "1",
            "_score": 0.75942194,
            "_source": {
               "title": "The United States is wealthy"
            }
         }
      ]
   }
}

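Answer 1 (score: 1)

So close, you are missing just one thing!

In your query, you should change the analyzer! You have to run the query text through the my_synonyms analyzer for the synonyms to match. Right now you are querying with the standard analyzer, which only tokenizes your text into united, states, is, wealthy and does not apply any of your synonyms.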

Change this:

GET /my_index/country/_search
{

    "query" : {
        "match_phrase" : {
            "title" : {
               "query" : "United States is wealthy",
               "analyzer": "standard"
            }

        }
    }
}

To this:

GET /my_index/country/_search
{

    "query" : {
        "match_phrase" : {
            "title" : {
               "query" : "United States is wealthy",
               "analyzer": "my_synonyms"
            }

        }
    }
}

Now, when you run the query, the text United States will be correctly tokenized to usa.
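As a rough illustration of the difference between the two analyzers at query time, the Python sketch below prints the token stream each one would produce for the query text. It is only an approximation, not Elasticsearch code: standard_tokens and synonym_tokens are hypothetical helpers, and whitespace splitting stands in for the standard tokenizer.

```python
# Simplified model of the two query-time analyzers discussed above.
# Not Elasticsearch code: standard_tokens and synonym_tokens are hypothetical helpers.

SYNONYMS = [s.split() for s in
            "usa,united states,u s a,united states of america".split(",")]

def standard_tokens(text):
    """Approximates "analyzer": "standard": lowercase, split on whitespace."""
    return text.lower().split()

def synonym_tokens(text):
    """Approximates "analyzer": "my_synonyms": one sorted token list per position."""
    tokens = standard_tokens(text)
    stacked = {}
    i = 0
    while i < len(tokens):
        hits = [s for s in SYNONYMS if tokens[i:i + len(s)] == s]
        if hits:
            # Emit every synonym, starting at the position of the matched phrase.
            for synonym in SYNONYMS:
                for offset, token in enumerate(synonym):
                    stacked.setdefault(i + offset, set()).add(token)
            i += len(max(hits, key=len))
        else:
            stacked.setdefault(i, set()).add(tokens[i])
            i += 1
    return [sorted(stacked[pos]) for pos in sorted(stacked)]

query = "United States is wealthy"
print(standard_tokens(query))   # one token per position, no synonyms
print(synonym_tokens(query))    # usa is stacked at the first position
```

On a real cluster, the analyze API is the authoritative way to check which tokens and positions an analyzer actually emits.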