Elasticsearch query returns zero results for a substring

Date: 2020-07-21 15:58:19

Tags: elasticsearch elasticsearch-query

I created my first AWS Elasticsearch cluster and uploaded some data to it (shown below).

When I search for a domain such as example.com, I get zero results.

Is this a problem with the search query or with the way the data is indexed?

# curl -XGET -u username:password 'https://xxxxx.us-east-1.es.amazonaws.com/hosts/_search?q=example.com&pretty=true'
{
  "took" : 7,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

I confirmed that a match_all query does return all records.

match_all

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "hosts",
        "_type" : "_doc",
        "_id" : "KK0PcnMBqk4TBzxZPeGU",
        "_score" : 1.0,
        "_source" : {
          "name" : "mail.stackoverflow.com",
          "type" : "a",
          "value" : "10.0.0.3"
        }
      },
      {
        "_index" : "hosts",
        "_type" : "_doc",
        "_id" : "J60PcnMBqk4TBzxZPeGU",
        "_score" : 1.0,
        "_source" : {
          "name" : "ns1.guardian.co.uk",
          "type" : "a",
          "value" : "10.0.0.2"
        }
      },
      {
        "_index" : "hosts",
        "_type" : "_doc",
        "_id" : "Ka0PcnMBqk4TBzxZPeGU",
        "_score" : 1.0,
        "_source" : {
          "name" : "test.example.com",
          "type" : "a",
          "value" : "10.0.0.4"
        }
      }
    ]
  }
}

Bulk upload command

curl -XPUT -u username:password https://xxxxx.us-east-1.es.amazonaws.com/_bulk --data-binary @bulk.json -H 'Content-Type: application/json'

bulk.json

{ "index" : { "_index": "hosts" } }
{"name":"ns1.guardian.co.uk","type":"a","value":"10.0.0.2"}
{ "index" : { "_index": "hosts" } }
{"name":"mail.stackoverflow.com","type":"a","value":"10.0.0.3"}
{ "index" : { "_index": "hosts" } }
{"name":"test.example.com","type":"a","value":"10.0.0.4"}

1 answer:

Answer 0 (score: 1):

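You can use the path hierarchy tokenizer, which takes a hierarchical value such as a filesystem path, splits it on the path separator, and emits a term for each component in the tree.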

Index mapping:

{
  "settings": {
    "analysis": {
      "analyzer": {
        "path-analyzer": {
          "type": "custom",
          "tokenizer": "path-tokenizer"
        }
      },
      "tokenizer": {
        "path-tokenizer": {
          "type": "path_hierarchy",
          "delimiter": ".",
          "reverse": true
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "path-analyzer",
        "search_analyzer": "keyword"
      }
    }
  }
}
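
The analyzer has to be in place when the index is created, and the analyzer of an existing field cannot be changed afterwards, so the hosts index from the question would need to be re-created and the data loaded again. The following is a minimal sketch of building that create-index request with the Python standard library. The base URL `http://localhost:9200` and the helper name `build_create_index_request` are assumptions for illustration, and authentication (the `-u username:password` from the question) is left out.

```python
import json
import urllib.request

# Assumption: a placeholder endpoint; substitute your own cluster URL.
ES_URL = "http://localhost:9200"
INDEX = "hosts"

# Same settings and mapping as in the answer above.
INDEX_BODY = {
    "settings": {
        "analysis": {
            "analyzer": {
                "path-analyzer": {"type": "custom", "tokenizer": "path-tokenizer"}
            },
            "tokenizer": {
                "path-tokenizer": {
                    "type": "path_hierarchy",
                    "delimiter": ".",
                    "reverse": True,
                }
            },
        }
    },
    "mappings": {
        "properties": {
            "name": {
                "type": "text",
                "analyzer": "path-analyzer",
                "search_analyzer": "keyword",
            }
        }
    },
}


def build_create_index_request(base_url, index, body):
    """Hypothetical helper: build (but do not send) a PUT /<index> request."""
    return urllib.request.Request(
        url=f"{base_url}/{index}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_create_index_request(ES_URL, INDEX, INDEX_BODY)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

The request is only built, not sent, so the sketch can be inspected without a running cluster.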

Analyze API

In the index mapping above, reverse is set to true, which makes the tokenizer emit tokens in reverse order. (reverse defaults to false.)

POST /hosts/_analyze
{
  "analyzer": "path-analyzer",
  "text": "test.example.com"
}

This produces three tokens:

{
  "tokens": [
    {
      "token": "test.example.com",
      "start_offset": 0,
      "end_offset": 16,
      "type": "word",
      "position": 0
    },
    {
      "token": "example.com",
      "start_offset": 5,
      "end_offset": 16,
      "type": "word",
      "position": 0
    },
    {
      "token": "com",
      "start_offset": 13,
      "end_offset": 16,
      "type": "word",
      "position": 0
    }
  ]
}
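
The token list above can be reproduced offline with a few lines of Python. This is only a rough emulation of the tokenizer's splitting behavior, not Elasticsearch's implementation: the function name `path_hierarchy_tokens` is hypothetical, offsets and positions are not modeled, and edge cases such as leading or trailing delimiters are ignored.

```python
def path_hierarchy_tokens(text, delimiter="/", reverse=False):
    """Approximate the terms a path_hierarchy tokenizer emits for `text`."""
    parts = text.split(delimiter)
    if reverse:
        # Drop leading components one at a time: a.b.c -> a.b.c, b.c, c
        return [delimiter.join(parts[i:]) for i in range(len(parts))]
    # Grow the prefix one component at a time: a.b.c -> a, a.b, a.b.c
    return [delimiter.join(parts[:i]) for i in range(1, len(parts) + 1)]


tokens = path_hierarchy_tokens("test.example.com", delimiter=".", reverse=True)
print(tokens)  # ['test.example.com', 'example.com', 'com']
```

With `reverse=False` the same input would yield `test`, `test.example`, `test.example.com`, which is why the reversed form is the one that suits domain-suffix searches.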

Search query:

{
  "query": {
    "term": {
      "name": "example.com"
    }
  }
}

Search results:

"hits": [
  {
    "_index": "hosts",
    "_type": "_doc",
    "_id": "d67gdHMBcF4W0YVjq8ed",
    "_score": 1.3744103,
    "_source": {
      "name": "test.example.com",
      "type": "a",
      "value": "10.0.0.4"
    }
  }
]
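
To see why the term query matches here, it may help to model the index as a map from each emitted suffix term to the documents containing it. The sketch below does that for the three host names from the question. It is a toy inverted index with hypothetical helper names, not how Elasticsearch stores data, and it assumes the query string is looked up as-is (which is what a term query, or the keyword search analyzer, amounts to).

```python
from collections import defaultdict

# The three "name" values from bulk.json in the question.
HOSTS = ["ns1.guardian.co.uk", "mail.stackoverflow.com", "test.example.com"]


def suffix_terms(name, delimiter="."):
    """Terms a reversed path hierarchy split would emit for one host name."""
    parts = name.split(delimiter)
    return [delimiter.join(parts[i:]) for i in range(len(parts))]


def build_index(names):
    """Toy inverted index: term -> set of host names containing that term."""
    index = defaultdict(set)
    for name in names:
        for term in suffix_terms(name):
            index[term].add(name)
    return index


def term_lookup(index, query):
    """Exact, unanalyzed lookup of a single term."""
    return sorted(index.get(query, set()))


index = build_index(HOSTS)
print(term_lookup(index, "example.com"))  # ['test.example.com']
print(term_lookup(index, "com"))          # ['mail.stackoverflow.com', 'test.example.com']
print(term_lookup(index, "example"))      # []
```

The last lookup returns nothing because only whole suffixes are indexed. The original query likely failed for a similar reason: with the default standard analyzer, `test.example.com` is generally kept as a single token, so `example.com` had no matching term.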