Question

This is a two-part question.

My documents look like this:

{"url": "https://someurl.com", 
 "content": "searchable content here", 
 "hash": "c54cc9cdd4a79ca10a891b8d1b7783c295455040", 
 "headings": "more searchable content", 
 "title": "Page Title"}

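My first question is how to retrieve all documents whose "title" is exactly "No Title". I don't want a document with the title "This Document Has No Title" to show up.

My second question is how to retrieve all documents whose "url" exactly matches one of the entries in a long list of URLs.

I'm using pyelasticsearch, but a generic answer in curl would work too.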

Answer 1

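You have to define a mapping for the field.

If you are looking for exact values (case-sensitive), you can set the field's index property to not_analyzed.

Something like this: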

"url" : {"type" : "string", "index" : "not_analyzed"}
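
As a rough sketch of how this mapping and the two queries from the question might fit together, the Python below only builds the request bodies as plain dicts; it does not contact a cluster. The doc type name `page`, the helper names, and the second example URL are assumptions made for illustration. The `string` / `not_analyzed` syntax is the legacy form used in this answer; Elasticsearch 5.x and later express the same idea with the `keyword` type.

```python
import json

# Hypothetical doc type name; the question does not give one.
DOC_TYPE = "page"


def build_mapping():
    """Mapping body that stores title and url as single, unanalyzed terms."""
    exact = {"type": "string", "index": "not_analyzed"}
    return {DOC_TYPE: {"properties": {"title": dict(exact), "url": dict(exact)}}}


def exact_title_query(title):
    """term query: matches only documents whose whole title equals `title`."""
    return {"query": {"term": {"title": title}}}


def url_list_query(urls):
    """terms query: matches documents whose url equals any entry in `urls`."""
    return {"query": {"terms": {"url": list(urls)}}}


if __name__ == "__main__":
    print(json.dumps(build_mapping(), indent=2))
    print(json.dumps(exact_title_query("No Title")))
    print(json.dumps(url_list_query(["https://someurl.com", "https://example.org"])))
```

The mapping has to be in place before the documents are indexed; existing documents would need to be reindexed for the change to take effect.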

Answer 2

If you store the source (which is the default), you can use a script filter.

It should look something like this:

$ curl -XPUT localhost:9200/index/type/1 -d '{"foo": "bar"}'
$ curl -XPUT localhost:9200/index/type/2 -d '{"foo": "bar baz"}'
$ curl -XPOST localhost:9200/index/type/_search?pretty=true -d '{
"filter": {
    "script": {
        "script": "_source.foo == \"bar\""
    }
}
}'
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "index",
      "_type" : "type",
      "_id" : "1",
      "_score" : 1.0, "_source" : {"foo": "bar"}
    } ]
  }
}

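Edit: I think it's worth mentioning that the "not_analyzed" mapping should be the faster approach. But if you want both exact and partial matches on this field, I see two options: use a script, or index the data twice (once analyzed, once not analyzed).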
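
The "index it twice" option can be sketched as a multi-field mapping, where the main field is analyzed for partial matches and a sub-field keeps the value as a single term for exact matches. The sketch below assumes the `fields` syntax of Elasticsearch 1.x; the sub-field name `raw` and the helper names are illustrative choices, not something this answer specifies. It only builds the request bodies.

```python
def twice_indexed_field():
    """Field mapping with an analyzed main field and a not_analyzed sub-field."""
    return {
        "type": "string",  # analyzed by default: supports partial matches
        "fields": {
            # hypothetical sub-field name; holds the untouched original value
            "raw": {"type": "string", "index": "not_analyzed"}
        },
    }


def partial_match(field, text):
    """match query against the analyzed field."""
    return {"query": {"match": {field: text}}}


def exact_match(field, value):
    """term query against the unanalyzed `<field>.raw` sub-field."""
    return {"query": {"term": {field + ".raw": value}}}
```

With such a mapping, `partial_match("foo", "bar")` would be expected to find both example documents above, while `exact_match("foo", "bar")` would target only the one whose value is exactly "bar".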

Answer 3

Try this approach. It works.

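import json
from elasticsearch import Elasticsearch

# Placeholder connection settings; adjust for your cluster.
host = "localhost"
port = 9200
connection = Elasticsearch([{'host': host, 'port': port}])

# match_phrase requires the query terms to appear in order in the field.
elastic_query = json.dumps({
    "query": {
        "match_phrase": {
            "UserName": "name"
        }
    }
})
result = connection.search(index="test_index", body=elastic_query)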
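
One caveat worth checking before relying on this: on an analyzed field, `match_phrase` looks for the terms in order anywhere in the field, so it may also match longer values that contain the phrase. The toy model below illustrates the difference. Its tokenizer (lowercase, split on whitespace) is a deliberate simplification of a real analyzer, and all function names are made up for the illustration.

```python
def tokenize(text):
    """Very rough stand-in for an analyzer: lowercase, split on whitespace."""
    return text.lower().split()


def phrase_matches(field_value, phrase):
    """True if the phrase's tokens appear consecutively in the field's tokens."""
    tokens, wanted = tokenize(field_value), tokenize(phrase)
    if not wanted:
        return False
    return any(
        tokens[i:i + len(wanted)] == wanted
        for i in range(len(tokens) - len(wanted) + 1)
    )


def exact_matches(field_value, value):
    """True only if the whole stored value equals the query value."""
    return field_value == value


titles = ["No Title", "This Document Has No Title"]
print([t for t in titles if phrase_matches(t, "No Title")])  # both titles
print([t for t in titles if exact_matches(t, "No Title")])   # only "No Title"
```

If only whole-value matches are wanted, a term query on a not_analyzed field is likely the closer fit.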

How do I make elasticsearch perform an exact match query?
