ElasticSearch query with diacritics/accents in PHP

Date: 2016-08-09 14:32:37

Tags: php elasticsearch encoding diacritics

I have the following expression: "noaptebună". I am trying to get the same results whether I search for "bună" or "buna".

I followed the tutorial here: https://www.elastic.co/guide/en/elasticsearch/guide/current/asciifolding-token-filter.html but got no results.

Here is my code:

$params = ['index' => 'asciiv3', 'body' => [
    "settings" => [
        "analysis" => [
            "analyzer" => [
                "folding" => [
                    "tokenizer" => "standard",
                    "filter" =>  [ "lowercase", "asciifolding" ]
                ]
            ]
        ]
    ],
    "mappings" => [
        "asciiv3" => [
            "properties" => [
                "saying" => [
                    "type" =>           "string",
                    "analyzer" =>       "standard",
                    "fields" => [
                        "folded" => [
                            "type" =>       "string",
                            "analyzer" =>   "folding"
                        ]
                    ]
                ]
            ]
        ]
    ]
]];
self::$instance->indices()->create($params);

Here is the query array:

'multi_match' =>
    array(
        "type" =>     "most_fields",
        "query" =>    "bună",
        "fields" => [ "saying", "saying.folded" ]
    )

Does anyone know what I am doing wrong?

1 answer:

Answer 0 (score: 0)

It works for me. Here is my setup:

PUT asciiv3
{
  "settings": {
    "analysis": {
      "analyzer": {
        "folding": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "asciifolding"
          ]
        }
      }
    }
  },
  "mappings": {
    "asciiv3": {
      "properties": {
        "saying": {
          "type": "string",
          "analyzer": "standard",
          "fields": {
            "folded": {
              "type": "string",
              "analyzer": "folding"
            }
          }
        }
      }
    }
  }
}
POST /asciiv3/asciiv3/1
{
  "saying":"bună ziua"
}
POST /asciiv3/asciiv3/2
{
  "saying":"buna ziua"
}

GET /asciiv3/_search
{
  "query": {
    "multi_match": {
      "type": "most_fields",
      "query": "bună",
      "fields": [
        "saying",
        "saying.folded"
      ]
    }
  }
}

I get these results:

   "hits": {
      "total": 2,
      "max_score": 0.2712221,
      "hits": [
         {
            "_index": "asciiv3",
            "_type": "asciiv3",
            "_id": "1",
            "_score": 0.2712221,
            "_source": {
               "saying": "bună ziua"
            }
         },
         {
            "_index": "asciiv3",
            "_type": "asciiv3",
            "_id": "2",
            "_score": 0.028130025,
            "_source": {
               "saying": "buna ziua"
            }
         }
      ]
   }
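
For anyone reproducing this from PHP, as in the question, the requests above could be built roughly as follows. This is only a sketch under a few assumptions: it targets the official elasticsearch-php client, the three helper functions are hypothetical names I made up for illustration, and the index and type names are the ones used above. Note that the `multi_match` array has to sit under a `query` key inside `body`.

```php
<?php
// Sketch only: builds the request arrays for the elasticsearch-php client.
// The helper names below are hypothetical, not part of any library.

/** Parameters for indexing one document (mapping type "asciiv3", as above). */
function buildIndexParams(string $id, string $saying): array
{
    return [
        'index' => 'asciiv3',
        'type'  => 'asciiv3',
        'id'    => $id,
        'body'  => ['saying' => $saying],
    ];
}

/** Parameters for the multi_match search; note the enclosing 'query' key. */
function buildSearchParams(string $text): array
{
    return [
        'index' => 'asciiv3',
        'body'  => [
            'query' => [
                'multi_match' => [
                    'type'   => 'most_fields',
                    'query'  => $text,
                    'fields' => ['saying', 'saying.folded'],
                ],
            ],
        ],
    ];
}

/** Parameters for the _analyze API, to inspect what the "folding" analyzer emits. */
function buildAnalyzeParams(string $text): array
{
    return [
        'index' => 'asciiv3',
        'body'  => ['analyzer' => 'folding', 'text' => $text],
    ];
}

// Assumed usage, given a configured client in $client
// (for example Elasticsearch\ClientBuilder::create()->build()):
//
//   $client->index(buildIndexParams('1', 'bună ziua'));
//   $client->index(buildIndexParams('2', 'buna ziua'));
//   $client->indices()->refresh(['index' => 'asciiv3']); // make the new documents searchable
//   $tokens  = $client->indices()->analyze(buildAnalyzeParams('bună'));
//   $results = $client->search(buildSearchParams('bună'));
```

If the analyzer was applied when the index was created, the `analyze` call should return the token `buna` for the input `bună`. If it returns `bună` unchanged, the `folding` analyzer is probably not in effect on that index. The exact parameters accepted by `analyze` differ between client versions, so treat the `body` form here as an assumption and check it against the version you have installed.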