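I want to build large nested queries (they will be big but simple), and I keep running into errors when nesting them. I have tried several variants (based on the documentation), and the error I usually get is: filter malformed, no field after start_object.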
The query I want to build is a boolean compound: an OR whose branches are each an AND of two conditions.
The sample data I am using:
{'N_timeend_epoch': 10, 'N_marker': True, 'N_hostip': 'A'}
{'N_timeend_epoch': 10, 'N_marker': True, 'N_hostip': 'B'}
{'N_timeend_epoch': 11, 'N_marker': True, 'N_hostip': 'A'}
{'N_timeend_epoch': 11, 'N_marker': True, 'N_hostip': 'B'}
{'N_timeend_epoch': 10, 'N_marker': False, 'N_hostip': 'A'}
{'N_timeend_epoch': 11, 'N_marker': False, 'N_hostip': 'B'}
{'N_timeend_epoch': 11, 'N_marker': False, 'N_hostip': 'B'}
They are loaded correctly into Elasticsearch:
curl http://localhost:9200/yop/_search?pretty
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 7,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "yop",
      "_type" : "document",
      "_id" : "AUpEErMEPK-TLWy_CSAU",
      "_score" : 1.0,
      "_source":{"N_hostip": "A", "N_timeend_epoch": 10, "N_marker": true}
    }, {
      "_index" : "yop",
      "_type" : "document",
      "_id" : "AUpEErMEPK-TLWy_CSAZ",
      "_score" : 1.0,
      "_source":{"N_hostip": "B", "N_timeend_epoch": 11, "N_marker": false}
    },
    (...)
I am looking for entries with specific N_timeend_epoch and N_hostip values. The following code builds and sends the search query:
import json
import requests

# One AND-branch per (host, epoch) pair. This nesting is what fails to parse.
list_markers = list()
for N_hostip, N_timeend_epoch in [("A", 10), ("B", 10)]:
    list_markers.append(
        {
            "query": {
                "filtered": {
                    "filter": {
                        "bool": {
                            "must": [
                                {"N_hostip": N_hostip},
                                {"N_timeend_epoch": N_timeend_epoch}
                            ]
                        }
                    }
                }
            }
        }
    )

# OR the branches together.
q = {
    "query": {
        "filtered": {
            "filter": {"bool": {"should": list_markers}}
        }
    }
}
url = "http://localhost:9200/yop/_search"
r = requests.get(url=url, data=json.dumps(q))
print(r.json())
I expect to get back the documents
{'N_timeend_epoch': 10, 'N_marker': True, 'N_hostip': 'A'},
{'N_timeend_epoch': 10, 'N_marker': True, 'N_hostip': 'B'},
{'N_timeend_epoch': 10, 'N_marker': False, 'N_hostip': 'A'},
The JSON built above (json.dumps(q)) is
{
  "query":{
    "filtered":{
      "filter":{
        "bool":{
          "should":[
            {
              "query":{
                "bool":{
                  "must":[
                    {
                      "N_hostip":"A"
                    },
                    {
                      "N_timeend_epoch":10
                    }
                  ]
                }
              }
            },
            {
              "query":{
                "bool":{
                  "must":[
                    {
                      "N_hostip":"B"
                    },
                    {
                      "N_timeend_epoch":10
                    }
                  ]
                }
              }
            }
          ]
        }
      }
    }
  }
}
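I do not understand how to combine query with filter/filtered. I have tried using only filter/filtered to wrap all the queries, as well as several other combinations of these patterns, but they all lead to this error: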
{u'status': 400, u'error': u'SearchPhaseExecutionException[Failed to execute phase [query], all shards failed; shardFailures {[bUIc4GtASg-1iFokFMwI8A][yop][0]: SearchParseException[[yop][0]: from[-1],size[-1]: Parse Failure [Failed to parse source [{"query": {"filtered": {"filter": {"bool": {"should": [{"query": {"filtered": {"filter": {"bool": {"must": [{"N_hostip": "A"}, {"N_timeend_epoch": 10}]}}}}}, {"query": {"filtered": {"filter": {"bool": {"must": [{"N_hostip": "B"}, {"N_timeend_epoch": 10}]}}}}}]}}}}}]]]; nested: QueryParsingException[[yop] [_na] filter malformed, no field after start_object]; }{[bUIc4GtASg-1iFokFMwI8A][yop][1]: SearchParseException[[yop][1]: from[-1],size[-1]: Parse Failure [Failed to parse source [{"query": {"filtered": {"filter": {"bool": {"should": [{"query": {"filtered": {"filter": {"bool": {"must": [{"N_hostip": "A"}, {"N_timeend_epoch": 10}]}}}}}, {"query": {"filtered": {"filter": {"bool": {"must": [{"N_hostip": "B"}, {"N_timeend_epoch": 10}]}}}}}]}}}}}]]]; nested: QueryParsingException[[yop] [_na] filter malformed, no field after start_object]; }{[bUIc4GtASg-1iFokFMwI8A][yop][2]: SearchParseException[[yop][2]: from[-1],size[-1]: Parse Failure [Failed to parse source [{"query": {"filtered": {"filter": {"bool": {"should": [{"query": {"filtered": {"filter": {"bool": {"must": [{"N_hostip": "A"}, {"N_timeend_epoch": 10}]}}}}}, {"query": {"filtered": {"filter": {"bool": {"must": [{"N_hostip": "B"}, {"N_timeend_epoch": 10}]}}}}}]}}}}}]]]; nested: QueryParsingException[[yop] [_na] filter malformed, no field after start_object]; }{[bUIc4GtASg-1iFokFMwI8A][yop][3]: SearchParseException[[yop][3]: from[-1],size[-1]: Parse Failure [Failed to parse source [{"query": {"filtered": {"filter": {"bool": {"should": [{"query": {"filtered": {"filter": {"bool": {"must": [{"N_hostip": "A"}, {"N_timeend_epoch": 10}]}}}}}, {"query": {"filtered": {"filter": {"bool": {"must": [{"N_hostip": "B"}, {"N_timeend_epoch": 10}]}}}}}]}}}}}]]]; nested: 
QueryParsingException[[yop] [_na] filter malformed, no field after start_object]; }{[bUIc4GtASg-1iFokFMwI8A][yop][4]: SearchParseException[[yop][4]: from[-1],size[-1]: Parse Failure [Failed to parse source [{"query": {"filtered": {"filter": {"bool": {"should": [{"query": {"filtered": {"filter": {"bool": {"must": [{"N_hostip": "A"}, {"N_timeend_epoch": 10}]}}}}}, {"query": {"filtered": {"filter": {"bool": {"must": [{"N_hostip": "B"}, {"N_timeend_epoch": 10}]}}}}}]}}}}}]]]; nested: QueryParsingException[[yop] [_na] filter malformed, no field after start_object]; }]'}
How do I build this kind of query correctly?
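Note: I left off the python tag at first because, although my code is Python-based, the problem lies in the Elasticsearch syntax. Feel free to add the tag if you think it belongs.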
Answer 0 (score: 0)
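There are several ways to solve this; I will share one below. I used Elasticsearch 1.3.4.
Let me first say that if you have not yet seen the Sense plug-in for Chrome, you should check it out. Its autocomplete helps a lot with Elasticsearch's complex syntax. At Qbox we built a modified version that lets us share Elasticsearch code (a combination of Sense and GitHub Gist, you might say). Here is the code I put together while working on your problem: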
http://sense.qbox.io/gist/095b574569026b6d80fdbb0f4a2f66c7de844b13
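I will go into detail on the last code block, but first here is the setup. I created the index with a mapping that specifies "not_analyzed" on the "N_hostip" field, so we do not have to worry about tokens being lowercased (a common stumbling block: if you do not specify an analyzer, the standard analyzer is used, and it converts tokens to lowercase). I then bulk-indexed the documents listed above: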
curl -XDELETE "http://localhost:9200/yop/"
curl -XPUT "http://localhost:9200/yop/" -d'
{
  "mappings": {
    "doc": {
      "properties": {
        "N_hostip": {
          "type": "string",
          "index": "not_analyzed"
        },
        "N_marker": {
          "type": "boolean"
        },
        "N_timeend_epoch": {
          "type": "long"
        }
      }
    }
  }
}'
curl -XPOST "http://localhost:9200/yop/_bulk/" -d'
{"index": {"_index": "yop", "_type": "doc"}}
{"N_timeend_epoch": 10, "N_marker": true, "N_hostip": "A"}
{"index": {"_index": "yop", "_type": "doc"}}
{"N_timeend_epoch": 10, "N_marker": true, "N_hostip": "B"}
{"index": {"_index": "yop", "_type": "doc"}}
{"N_timeend_epoch": 11, "N_marker": true, "N_hostip": "A"}
{"index": {"_index": "yop", "_type": "doc"}}
{"N_timeend_epoch": 11, "N_marker": true, "N_hostip": "B"}
{"index": {"_index": "yop", "_type": "doc"}}
{"N_timeend_epoch": 10, "N_marker": false, "N_hostip": "A"}
{"index": {"_index": "yop", "_type": "doc"}}
{"N_timeend_epoch": 11, "N_marker": false, "N_hostip": "B"}
{"index": {"_index": "yop", "_type": "doc"}}
{"N_timeend_epoch": 11, "N_marker": false, "N_hostip": "B"}
'
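If you would rather generate that bulk body from Python than type it by hand, here is a minimal sketch. The helper name build_bulk_body is made up for illustration, and the index and type names ("yop", "doc") are simply the ones used in the curl commands above.

```python
import json

# Sample documents from the question.
DOCS = [
    {"N_timeend_epoch": 10, "N_marker": True, "N_hostip": "A"},
    {"N_timeend_epoch": 10, "N_marker": True, "N_hostip": "B"},
    {"N_timeend_epoch": 11, "N_marker": True, "N_hostip": "A"},
    {"N_timeend_epoch": 11, "N_marker": True, "N_hostip": "B"},
    {"N_timeend_epoch": 10, "N_marker": False, "N_hostip": "A"},
    {"N_timeend_epoch": 11, "N_marker": False, "N_hostip": "B"},
    {"N_timeend_epoch": 11, "N_marker": False, "N_hostip": "B"},
]


def build_bulk_body(docs, index="yop", doc_type="doc"):
    """Return a newline-delimited bulk body (hypothetical helper).

    Each document is preceded by an "index" action line, matching the
    format of the curl example above.
    """
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


bulk_body = build_bulk_body(DOCS)
print(bulk_body)
```

The resulting string could then be sent with something like requests.post("http://localhost:9200/yop/_bulk", data=bulk_body); I have not included that call here so the snippet runs without a server.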
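Then I used Sense to help me set up the query (the autocomplete is good at telling you which blocks are allowed inside which, though it is not perfect). I used a top-level filter because a filter skips relevance scoring and can be cached, which generally makes it cheaper than a query, and no query is needed here. Also note that my outer should contains two bool clauses, each with a must holding two term filters (had I not used not_analyzed in the mapping, I would have needed "N_hostip": "a", and so on). A document is therefore returned if it matches either of the two clauses in should.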
curl -XPOST "http://localhost:9200/yop/_search" -d'
{
  "filter": {
    "bool": {
      "should": [
        {
          "bool": {
            "must": [
              { "term": { "N_hostip": "A" } },
              { "term": { "N_timeend_epoch": 10 } }
            ]
          }
        },
        {
          "bool": {
            "must": [
              { "term": { "N_hostip": "B" } },
              { "term": { "N_timeend_epoch": 10 } }
            ]
          }
        }
      ]
    }
  }
}'
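This returns what I think you expect. Translating it into Python code should be straightforward (and if you have not already, make sure to take a look at the Python client).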
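As a rough sketch of that translation, the request body can be built from the same list of (host, epoch) pairs used in the question. The names build_filter and matches are hypothetical. matches is only a toy local evaluator for term, must and should, written to sanity-check the boolean logic against the sample data without a running cluster; it is not how Elasticsearch evaluates filters.

```python
import json

# Sample documents from the question.
DOCS = [
    {"N_timeend_epoch": 10, "N_marker": True, "N_hostip": "A"},
    {"N_timeend_epoch": 10, "N_marker": True, "N_hostip": "B"},
    {"N_timeend_epoch": 11, "N_marker": True, "N_hostip": "A"},
    {"N_timeend_epoch": 11, "N_marker": True, "N_hostip": "B"},
    {"N_timeend_epoch": 10, "N_marker": False, "N_hostip": "A"},
    {"N_timeend_epoch": 11, "N_marker": False, "N_hostip": "B"},
    {"N_timeend_epoch": 11, "N_marker": False, "N_hostip": "B"},
]


def build_filter(pairs):
    """Build the request body: an OR (should) of AND (must) clauses."""
    clauses = [
        {
            "bool": {
                "must": [
                    {"term": {"N_hostip": host}},
                    {"term": {"N_timeend_epoch": epoch}},
                ]
            }
        }
        for host, epoch in pairs
    ]
    return {"filter": {"bool": {"should": clauses}}}


def matches(flt, doc):
    """Toy evaluator (assumption: only term / bool.must / bool.should)."""
    if "term" in flt:
        # A term filter holds exactly one field/value pair.
        ((field, value),) = flt["term"].items()
        return doc.get(field) == value
    clause = flt["bool"]
    must_ok = all(matches(f, doc) for f in clause.get("must", []))
    should = clause.get("should", [])
    should_ok = any(matches(f, doc) for f in should) if should else True
    return must_ok and should_ok


q = build_filter([("A", 10), ("B", 10)])
hits = [doc for doc in DOCS if matches(q["filter"], doc)]

print(json.dumps(q, indent=2))
print(hits)
```

The body in q has the same shape as the curl example, so it could be sent with requests.post("http://localhost:9200/yop/_search", data=json.dumps(q)). Note that each entry in should is a bare bool filter with term filters inside; there is no extra "query" or "filtered" wrapper per branch, and dropping those wrappers is the main change from the code in the question.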