How to suggest cities (results) with ElasticSearch as I type

Date: 2014-12-11 20:52:30

Tags: elasticsearch elasticsearch-plugin

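I am new to ElasticSearch and have spent hours trying to solve this. Thanks in advance to anyone willing to help.

:) (Not so) short explanation (what I have so far and what I am trying to achieve):

I created a CouchDB database (spain_locales) containing more than 8,000 documents for Spanish cities and provinces. On the other side, I have an HTML form with jQuery autocomplete that shows results as I type. I connect to ElasticSearch from PHP (a Laravel service provider I created), which returns the results to jQuery autocomplete. I suppose this could be done by connecting to ElasticSearch directly from the client, but for security reasons I prefer it this way for now.

:( The problem:

The results I get from ElasticSearch are not quite what I expect, and I do not know how to fix what I have or whether this is the right approach. I do not know whether a bool query is what I need or whether I should use a different type of query.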

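  1. I only get results if I type the word exactly as it is stored in the database:

    If I type Álava I get results, but not for Alava (the accent on the Á makes the difference).

  2. I do not get results until I have typed the whole word:

    If I type Albacete I get results, but not for Albacet.

  3. I sync CouchDB with ElasticSearch using the CouchDB River Plugin for ElasticSearch (https://github.com/elasticsearch/elasticsearch-river-couchdb), with the following command from the terminal: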

    curl -XPUT 'localhost:9200/_river/spain_locales/_meta' -d '{
        "type" : "couchdb",
        "couchdb" : {
            "host" : "localhost",
            "port" : 5984,
            "db" : "spain_locales",
            "filter" : null
        },
        "index" : {
            "index" : "spain_locales",
            "type" : "spain_locales",
            "bulk_size" : "100",
            "bulk_timeout" : "10ms"
        }
    }'
    

    I also tried:

    curl -XPUT 'localhost:9200/_river/spain_locales/_meta' -d '{
        "type" : "couchdb",
        "couchdb" : {
            "host" : "localhost",
            "port" : 5984,
            "db" : "spain_locales",
            "filter" : null
        },
        "index" : {
            "number_of_shards" : 2,
            "refresh_interval" : "1s",
            "analysis": {
              "analyzer": {
                "folding": {
                  "tokenizer": "standard",
                  "filter":  [ "lowercase", "asciifolding" ]
                }
              }
            },
            "index" : "spain_locales",
            "type" : "spain_locales",
            "bulk_size" : "100",
            "bulk_timeout" : "10ms"
        }
    }'
    

    Neither of the above returns an error, and both create the _river sync successfully, but the accent and whole-word problems remain.

    I also tried to apply the required filters from the terminal with the following command:

    curl -XPUT 'localhost:9200/spain_locales/' -d '
    {
      "settings": {
        "analysis": {
          "analyzer": {
            "folding": {
              "tokenizer": "standard",
              "filter":  [ "lowercase", "asciifolding" ]
            }
          }
        }
      },
      "uuid":"KwKrBc3uQoG5Ld1nOdc5rQ"
    }'
    

    But I get the following error:

    {"error":"IndexAlreadyExistsException[[spain_locales] already exists]","status":400}
    

    Example CouchDB documents:

    {
       "_id": "1",
       "_rev": "1-087ddbe8593f68f1d7d37a9c3f6de787",
       "Provincia": "Álava",
       "Poblacion": "Alegría-Dulantzi",
       "helper": ""
    }
    
    {
       "_id": "10",
       "_rev": "1-ce38dcdabeb3b34d34d2296c6e2fdf24",
       "Provincia": "Álava",
       "Poblacion": "Ayala/Aiara",
       "helper": ""
    }
    
    {
       "_id": "100",
       "_rev": "1-72e66601e378ee48519aa93601dc0717",
       "Provincia": "Albacete",
       "Poblacion": "Herrera (La)",
       "helper": "La Herrera"
    }
    

    PHP service provider / controller:

    public function searchzones(){
    
        $q = (Input::has('term')) ? Input::get('term') : 'null';
    
        $params['index'] = 'spain_locales';
        $params['type']  = 'spain_locales';
    
        $params['body']['query']['bool']['should'] = array(
            array('match' => array('Poblacion' =>  $q)),
            array('match' => array('Provincia' =>  $q))
        );
    
        $query = $this->elasticsearch->search($params);
    
        if ($query['hits']['total'] >= 1){
    
            $results = $query['hits']['hits'];
    
            foreach ($results as $zone) {
    
                $databag[] = array( "value"     => $zone['_source']['Poblacion'].', '.$zone['_source']['Provincia'],
                                    "state"     => $zone['_source']['Provincia'],
                                    "city"      => $zone['_source']['Poblacion'],
                );
    
            }
    
        } else {
    
            $results = ['res' => null];
            $databag[] = array();
    
        }
    
        return $databag;
    
    } // End Search Zones
    

    jQuery (JavaScript):

    // Suggest locations as the user types in #zones
    $(document).ready(function() {
        $('#zones').autocomplete({
    
                source : applink + 'ajax/searchzones',
                select : function(event, ui){
                    console.log(ui);
                }
    
        }); // End autocomplete
    }); // End Document ready
    

    HTML form section (Twitter Bootstrap):

    <div class="form-group">
    <div class="input-group input-append dropdown">
    <input type="text" class="form-control typeahead" placeholder="City name" name="zones" id="zones">
    <div class="input-group-btn" >
    <button type="button" class="btn btn-default dropdown-toggle" data-toggle="dropdown"><span class="caret"></span></button>
    <ul class="dropdown-menu dropdown-menu-right" id="dropZonesAjax">                           
    </ul>
    </div>
    </div>
    <div id="zonesAjax"></div>   
    </div>
    

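    I found the following resource: http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/asciifolding-token-filter.html but I do not know how to implement it.

    Thank you very much for your time and for trying to help! Sorry for my poor English!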

1 answer:

Answer 0: (score: 0)

Try creating the mapping before indexing. You can then define the analyzer you mentioned (folding) and assign it to your fields:

{
  "settings": {
    "analysis": {
      "analyzer": {
        "folding": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "asciifolding"
          ]
        }
      }
    }
  },
  "mappings": {
    "locales": {
      "properties": {
        "Provincia": {
          "type": "string",
          "analyzer": "folding"
        },
        "Poblacion": {
          "type": "string",
          "analyzer": "folding"
        },
        "helper": {
          "type": "string"
        }
      }
    }
  }
}
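
To see why this kind of analyzer helps with the accent problem but not with the partial-word problem, here is a rough JavaScript sketch. It is only an approximation: `foldingAnalyze`, `edgeNgrams` and `matches` are hypothetical helper names, and the accent stripping uses Unicode NFD normalization rather than Elasticsearch's actual asciifolding implementation. One more thing to check, as an assumption on my part: the mapping above is declared for a type named `locales`, while the river in the question indexes into the type `spain_locales`, so the type names would probably need to agree for the analyzer to apply.

```javascript
// Rough stand-in for the "folding" analyzer: strip accents, lowercase,
// and split into words. This approximates, but is not, Elasticsearch's
// standard tokenizer + lowercase + asciifolding chain.
function foldingAnalyze(text) {
  return text
    .normalize('NFD')                 // "Á" becomes "A" + combining acute accent
    .replace(/[\u0300-\u036f]/g, '')  // drop the combining marks
    .toLowerCase()
    .split(/[^a-z0-9]+/)              // crude word splitting
    .filter(Boolean);                 // remove empty strings
}

// Edge n-grams of one token: "alba" with minGram 2 gives "al", "alb", "alba".
function edgeNgrams(token, minGram = 1, maxGram = 20) {
  const grams = [];
  for (let n = minGram; n <= Math.min(maxGram, token.length); n++) {
    grams.push(token.slice(0, n));
  }
  return grams;
}

// A query matches when at least one analyzed query term is an indexed term.
function matches(indexedTerms, query) {
  return foldingAnalyze(query).some((term) => indexedTerms.includes(term));
}

// Accent problem: both spellings fold to the same term.
const folded = foldingAnalyze('Álava');
console.log(matches(folded, 'Alava'));                       // true

// Partial-word problem: folding alone does not help here.
console.log(matches(foldingAnalyze('Albacete'), 'Albacet')); // false

// Indexing prefixes of each term makes the partial word match.
const prefixes = foldingAnalyze('Albacete').flatMap((t) => edgeNgrams(t, 2));
console.log(matches(prefixes, 'Albacet'));                   // true
```

In Elasticsearch terms, the last step would correspond to something like adding an edge n-gram token filter to the index-time analyzer, or using a prefix-style query such as `match_phrase_prefix`. Treat that as a direction to explore, not a tested configuration for this index.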