Fuzzy query not working as expected (single-term search, see examples)

Asked: 2015-01-27 00:51:10

Tags: elasticsearch

Consider the results of the following query:

curl -XGET 'http://localhost:9200/megacorp/employee/_search' -d '{
  "query": {
    "match": {
      "last_name": "Smith"
    }
  }
}'

Result:

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.30685282,
    "hits": [
      {
        "_index": "megacorp",
        "_type": "employee",
        "_id": "1",
        "_score": 0.30685282,
        "_source": {
          "first_name": "John",
          "last_name": "Smith",
          "age": 25,
          "about": "I love to go rock climbing on the weekends.",
          "interests": [
            "sports",
            "music"
          ]
        }
      },
      {
        "_index": "megacorp",
        "_type": "employee",
        "_id": "2",
        "_score": 0.30685282,
        "_source": {
          "first_name": "Jane",
          "last_name": "Smith",
          "age": 25,
          "about": "I love to go rock climbing",
          "interests": [
            "sports",
            "music"
          ]
        }
      }
    ]
  }
}

Now, when I execute the following query:

curl -XGET 'http://localhost:9200/megacorp/employee/_search' -d '{
  "query": {
    "fuzzy": {
      "last_name": {
        "value": "Smitt",
        "fuzziness": 1
      }
    }
  }
}'

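No results are returned, even though the Levenshtein distance between "Smith" and "Smitt" is 1. The same happens with the value "Smit". If I set fuzziness to 2, I do get results. What am I missing here?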

1 answer:

Answer 0: (score: 1)

I assume the last_name field you are querying is an analyzed string. The indexed term will then be smith, not Smith.

  

> Even though the Levenshtein distance between "Smith" and "Smitt" is 1.

The fuzzy query does not analyze the search term, so your Levenshtein distance is actually 2, not 1:

  1. Smitt -> Smith
  2. Smith -> smith

Try this mapping, which indexes last_name verbatim (as Smith); your query with fuzziness = 1 will then work:

    PUT /megacorp/employee/_mapping
    {
      "employee":{
        "properties":{
          "last_name":{
            "type":"string",
            "index":"not_analyzed"
          }
        }
      }
    }
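To check the distance argument above by hand, here is a minimal sketch in Python. It uses a plain textbook Levenshtein implementation, not Elasticsearch's own code, and it assumes the analyzed field stores the lowercased term smith:

```python
def levenshtein(a: str, b: str) -> int:
    """Textbook dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca -> cb
        prev = curr
    return prev[-1]


# Assumption: an analyzed last_name field indexes "Smith" as the term "smith".
indexed_term = "smith"

for query in ("Smitt", "Smit", "smitt"):
    print(query, levenshtein(query, indexed_term))
# prints:
# Smitt 2
# Smit 2
# smitt 1
```

Under that assumption, both "Smitt" and "Smit" are two edits away from the indexed term, which is consistent with the query matching only at fuzziness 2. A lowercased search value such as "smitt" is one edit away.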
    

Hope this helps.