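I need to search an index for matches that are greater than or equal to some phrase. To be clearer, I need to build a query like the following SQL: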
SELECT * FROM Table WHERE MyNVarCharField >= 'some_string'
Mapping:
{
  "tock": {
    "mappings": {
      "post": {
        "properties": {
          "content": {
            "type": "string",
            "index_analyzer": "english"
          },
          "id": {
            "type": "double"
          },
          "title": {
            "type": "string",
            "index_analyzer": "english"
          }
        }
      }
    }
  }
}
The index contains two objects:
[
  {
    "id": 1,
    "title": "Post1",
    "content": "Ash to ash item"
  },
  {
    "id": 2,
    "title": "Post2",
    "content": "Dust to dust item"
  }
]
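Now I want to search for objects whose content is greater than or equal to "Dust to dust item". I tried many different approaches and ended up with something like this: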
{
  "sort": [
    {
      "content": {
        "order": "asc"
      }
    }
  ],
  "filtered": {
    "query": {
      "match": {
        "content": {
          "query": "item"
        }
      }
    },
    "filter": {
      "range": {
        "content": {
          "from": "Dust to dust",
          "include_lower": true,
          "include_upper": true
        }
      }
    }
  }
}
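But it does not work as I expected: both objects are returned. So I need help :)
Is it really possible to query Elasticsearch this way? What do I need to do to split the index into two parts by a phrase?
By the way, I should mention that the phrase is guaranteed to already exist in the index.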
Answer 0 (score: 0)
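Your range filter matches both documents because the bound is compared against each term generated for the "content" field, not against the original source text. Since the english analyzer uses the standard tokenizer, one of the terms for each document is "item". Since "item" is greater than "dust", both documents match.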
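To make the term-by-term comparison concrete, here is a minimal Python sketch. The `analyze` function is a hypothetical, heavily simplified stand-in for the english analyzer (lowercasing, whitespace splitting, and a tiny stop-word list are assumptions; real stemming is not modeled), and the lower bound is compared as a literal string. Python compares strings by code point, which agrees with byte order for the ASCII text used here.

```python
# Hypothetical, simplified stand-in for the english analyzer (assumption):
# lowercase, split on whitespace, drop a few stop words. No stemming.
STOP_WORDS = {"to", "the", "a", "an", "of"}


def analyze(text):
    """Return the list of terms that would be indexed for `text`."""
    return [t for t in text.lower().split() if t not in STOP_WORDS]


def matches_per_term(text, lower_bound):
    """Analyzed field: the document matches if ANY single term is >= the bound."""
    return any(term >= lower_bound for term in analyze(text))


def matches_source_text(text, lower_bound):
    """What the question expected: compare the whole source string to the bound."""
    return text >= lower_bound


docs = {1: "Ash to ash item", 2: "Dust to dust item"}
bound = "Dust to dust"

per_term_hits = [doc_id for doc_id, content in docs.items()
                 if matches_per_term(content, bound)]
source_text_hits = [doc_id for doc_id, content in docs.items()
                    if matches_source_text(content, bound)]

print("per-term comparison matches:", per_term_hits)
print("whole-string comparison matches:", source_text_hits)
```

Under these assumptions the per-term comparison matches both documents, because each has at least one term (such as "item") that sorts after the bound, while the whole-string comparison matches only the second document.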
If your index contains a lot of documents, the approach you are using may become unusable, because a lot of terms will be generated.
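One thing you can do is use the "index": "not_analyzed" setting on the "content" field. Or, if you need "content" to be analyzed for other reasons, define a not_analyzed sub-field and run the range comparison against that field. Here is an example.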
So I defined an index as follows:
PUT /test_index
{
  "mappings": {
    "post": {
      "properties": {
        "content": {
          "type": "string",
          "index_analyzer": "english",
          "fields": {
            "raw": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        },
        "id": {
          "type": "double"
        },
        "title": {
          "type": "string",
          "index_analyzer": "english"
        }
      }
    }
  }
}
Then I added three documents (your two plus one more for comparison):
POST /test_index/_bulk
{"index":{"_index":"test_index","_type":"post","_id":1}}
{"id": 1,"title": "Post1", "content": "Ash to ash item"}
{"index":{"_index":"test_index","_type":"post","_id":2}}
{"id": 2,"title": "Post2", "content": "Dust to dust item"}
{"index":{"_index":"test_index","_type":"post","_id":3}}
{"id": 3,"title": "Post3", "content": "Earth to earth item"}
Then I can use a range query against "content.raw":
POST /test_index/_search
{
  "query": {
    "range": {
      "content.raw": {
        "gte": "Dust to dust"
      }
    }
  }
}
It returns what I expect:
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {
        "_index": "test_index",
        "_type": "post",
        "_id": "2",
        "_score": 1,
        "_source": {
          "id": 2,
          "title": "Post2",
          "content": "Dust to dust item"
        }
      },
      {
        "_index": "test_index",
        "_type": "post",
        "_id": "3",
        "_score": 1,
        "_source": {
          "id": 3,
          "title": "Post3",
          "content": "Earth to earth item"
        }
      }
    ]
  }
}
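Edit: You can adapt the query you posted by changing "content" to "content.raw" in the range filter (also, your syntax was slightly off and gave me an error, so I wrapped the query and filter in a "query" block):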
POST /test_index/_search
{
  "sort": [
    {
      "content": {
        "order": "asc"
      }
    }
  ],
  "query": {
    "filtered": {
      "query": {
        "match": {
          "content": {
            "query": "item"
          }
        }
      },
      "filter": {
        "range": {
          "content.raw": {
            "from": "Dust to dust",
            "include_lower": true,
            "include_upper": true
          }
        }
      }
    }
  }
}
...
{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": null,
    "hits": [
      {
        "_index": "test_index",
        "_type": "post",
        "_id": "2",
        "_score": null,
        "_source": {
          "id": 2,
          "title": "Post2",
          "content": "Dust to dust item"
        },
        "sort": [
          "dust"
        ]
      },
      {
        "_index": "test_index",
        "_type": "post",
        "_id": "3",
        "_score": null,
        "_source": {
          "id": 3,
          "title": "Post3",
          "content": "Earth to earth item"
        },
        "sort": [
          "earth"
        ]
      }
    ]
  }
}
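The "sort" values in this response are single terms ("dust", "earth") rather than the full content strings, because the sort is still on the analyzed "content" field. The Python sketch below illustrates one plausible reading, assuming an ascending sort on a multi-term field uses the smallest term as the key. The `analyze` helper is a hypothetical simplification of the english analyzer (lowercase, whitespace split, a tiny stop-word list), not the real implementation.

```python
# Hypothetical, simplified stand-in for the english analyzer (assumption).
STOP_WORDS = {"to", "the", "a", "an", "of"}


def analyze(text):
    """Return the list of terms that would be indexed for `text`."""
    return [t for t in text.lower().split() if t not in STOP_WORDS]


def ascending_sort_key(text):
    """Assumed behavior: an ascending sort picks the smallest indexed term."""
    return min(analyze(text))


hits = ["Earth to earth item", "Dust to dust item"]
ordered = sorted(hits, key=ascending_sort_key)

for content in ordered:
    print(ascending_sort_key(content), "<-", content)
```

If ordering by the complete string is what you want, sorting on "content.raw" instead of "content" would likely be the more natural choice, since that sub-field holds the whole value as a single term.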
Here is the code I used for testing:
http://sense.qbox.io/gist/57968fda91b9bcd5b2f1d8236ecb5fc1953800b7