How do I search with Unicode characters in Elasticsearch?

Asked: 2017-07-25 11:40:50

Tags: php mysql elasticsearch unicode

I have indexed a MySQL column into Elasticsearch. The column contains values in several languages (AR / EN / RO). How can I search these indices with a Unicode string?

$hosts = ['localhost:9200'];              
$client = \Elasticsearch\ClientBuilder::create()->setHosts($hosts)->build();  

$body = '{  "query": {
"filtered": {
  "query": {
    "match_all": {}
  },
  "filter": {
    "bool": {
      "must": [
        {"query": {"wildcard": {"text": {"value": "*'.$term.'*"}}}},
        {"query": {"wildcard": {"group": {"value": "hotels_cities"}}}}
      ]
    }
  }
}  }}';



$params['index'] = 'my_custom_index_name';
$params['type']  = 'translator_translations';
$params['body'] = $body;

$results = $client->search($params);

The search returns zero hits.

- I know there is something called an analyzer, but I could not find any information on how to use it from PHP.

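2 Answers:

Answer 0: (score: 0)

I think I found the answer to how to index Unicode-language characters in Elasticsearch. I hope it is useful to someone.

  • First, set the index name.

  • Second, define the language settings with filters and a language analyzer, as follows: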

    use Elasticsearch\ClientBuilder;

    $client = ClientBuilder::create()       // Instantiate a new ClientBuilder
        ->setHosts(['localhost:9200'])      // Set the hosts
        ->build();                          // Build the client object
    
    $lang = 'el'; // Greek in my case
    
    $param['index'] = 'test_' . $lang; // index name
    
    // uncomment this line if you want to delete an existing index
    // $response = $client->indices()->delete($param);
    
    $body = '{
      "settings": {
        "analysis": {
          "filter": {
            "greek_stop": {
              "type":       "stop",
              "stopwords":  "_greek_" 
            },
            "greek_lowercase": {
              "type":       "lowercase",
              "language":   "greek"
            },
            "greek_keywords": {
              "type":       "keyword_marker",
              "keywords":   ["παράδειγμα"] 
            },
            "greek_stemmer": {
              "type":       "stemmer",
              "language":   "greek"
            }
          },
          "analyzer": {
            "greek": {
              "tokenizer":  "standard",
              "filter": [
                "greek_lowercase",
                "greek_stop",
                "greek_keywords",
                "greek_stemmer"
              ]
            }
          }
        }
      }
    }';
    
    $param['body'] = $body; // store the JSON body as a parameter in the main array
    
    $response = $client->indices()->create($param);
    

Then start indexing your values that contain Greek characters.
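
Once documents are indexed, a search could look roughly like the sketch below. It is only an illustration under stated assumptions: the helper name buildMatchSearch, the field name 'text' (borrowed from the question) and the index name 'test_el' are not part of the answer above. Note that a field uses the custom analyzer at index time only if its mapping names that analyzer; the 'analyzer' option of the match query shown here affects how the search term itself is analyzed. Building the body as a PHP array and letting json_encode serialize it also avoids the escaping problems of concatenating the term into a JSON string.

```php
<?php
// Hypothetical helper: build the parameters for a full-text "match" search.
// The index name and the field name passed in below are assumptions.
function buildMatchSearch(string $index, string $field, string $term): array
{
    return [
        'index' => $index,
        'body'  => [
            'query' => [
                'match' => [
                    $field => [
                        'query'    => $term,
                        'analyzer' => 'greek', // analyzer name defined in the index settings
                    ],
                ],
            ],
        ],
    ];
}

$params = buildMatchSearch('test_el', 'text', 'παράδειγμα');

// json_encode escapes quotes and backslashes in the term; JSON_UNESCAPED_UNICODE
// keeps non-ASCII characters readable instead of \uXXXX escapes.
echo json_encode($params['body'], JSON_UNESCAPED_UNICODE | JSON_PRETTY_PRINT), PHP_EOL;

// With a client built as shown earlier, the search itself would be:
// $results = $client->search($params);
```

Unlike a wildcard query, a match query runs the search term through an analyzer, so the term is lowercased and stemmed the same way as the indexed text.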

Answer 1: (score: 0)

You should query fieldname.keyword, the sub-field that stores the exact, unanalyzed value:

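// Build an exact-match "term" clause on the unanalyzed keyword sub-field
$rowArray = array();
$rowArray['term'] = array();
$rowArray['term']['title.keyword'] = (string)$csvRow[1];
// Append the clause to the "must" list of the bool query
$whereArray['bool']['must'][] = $rowArray;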
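
For context, here is a minimal sketch of how that clause might fit into a complete search request. The helper name addExactMatch and the sample value are hypothetical, the index name is borrowed from the question, and the sketch assumes the title field really has a keyword sub-field (for example from dynamic mapping). A term query is not analyzed, so it matches only when the stored value is identical to the search value, including case and accents.

```php
<?php
// Hypothetical helper: append an exact-match "term" clause on a keyword sub-field.
function addExactMatch(array $where, string $field, string $value): array
{
    $where['bool']['must'][] = ['term' => [$field . '.keyword' => $value]];
    return $where;
}

$whereArray = [];
$whereArray = addExactMatch($whereArray, 'title', 'Αθήνα'); // sample value, assumed

$params = [
    'index' => 'my_custom_index_name', // index name taken from the question
    'body'  => ['query' => $whereArray],
];

// Print the request body that would be sent to Elasticsearch.
echo json_encode($params['body'], JSON_UNESCAPED_UNICODE), PHP_EOL;

// With an Elasticsearch client available: $results = $client->search($params);
```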