I have data stored in ES. The mapping is quite simple:
{
  "index": {
    "aliases": {},
    "mappings": {
      "level1": {
        "properties": {
          "id": {
            "type": "string"
          },
          "level2": {
            "type": "nested",
            "properties": {
              "level3": {
                "type": "nested",
                "properties": {
                  "value1": {
                    "type": "string"
                  },
                  "value2": {
                    "type": "long"
                  },
                  "id": {
                    "type": "string"
                  },
                  "value3": {
                    "type": "long"
                  }
                }
              },
              "id": {
                "type": "string"
              }
            }
          }
        }
      }
    },
    "settings": {
      "index": {
        "creation_date": "1505476515647",
        "number_of_shards": "5",
        "number_of_replicas": "1",
        "uuid": "_0IiQCPrQ1i-kDP1481y8w",
        "version": {
          "created": "2030099"
        }
      }
    },
    "warmers": {}
  }
}
When I run this query:
{"query": {"terms": {"_id": [ "value51" ] }}}
I get back data with this structure:
_source (dict)
level1 (list)
level2 (list)
data1 (dict)
id
value1
value2
value3
data2 (dict)
data3 (dict)
...
data65000 (dict)
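The problem is that 65,000 entries is too many and I run out of memory. Is there some way, with _search or Elasticsearch in general, to process this data (data1, data2, data3, ...) in batches? Or is there a way to write the query so that my machine does not run out of memory? Any ideas?
Thanks!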
Answer 0 (score: 0)
You can use the source filtering feature: just specify the list of fields you want returned, for example:
{
  "_source": {
    "includes": [
      "data1*"
    ]
  },
  "query": {
    "terms": {
      "_id": [
        "value51"
      ]
    }
  }
}
Then:
{
  "_source": {
    "includes": [
      "data2*"
    ]
  },
  "query": {
    "terms": {
      "_id": [
        "value51"
      ]
    }
  }
}
But because this requires multiple queries, it may reduce performance.
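If you end up issuing many of these requests, it may be easier to generate the bodies than to write them by hand. The following is only a minimal sketch: `build_source_filtered_queries` and its `batch_size` parameter are hypothetical names of mine, the patterns are the `data1*`, `data2*` placeholders from above, and the code only builds the JSON bodies. Sending them to `_search` is left to whatever client you already use.

```python
import json
from typing import Dict, List


def build_source_filtered_queries(
    doc_id: str, patterns: List[str], batch_size: int = 1
) -> List[Dict]:
    """Return one search body per batch of _source include patterns.

    Hypothetical helper: each body repeats the same terms query on _id
    and changes only the "_source.includes" list.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    bodies = []
    for start in range(0, len(patterns), batch_size):
        batch = patterns[start:start + batch_size]
        bodies.append({
            "_source": {"includes": batch},
            "query": {"terms": {"_id": [doc_id]}},
        })
    return bodies


if __name__ == "__main__":
    # Placeholder patterns; replace them with the field names in your documents.
    patterns = ["data1*", "data2*", "data3*", "data4*"]
    for body in build_source_filtered_queries("value51", patterns, batch_size=2):
        # Each printed line is a request body that could be sent to _search.
        print(json.dumps(body))
```

A larger `batch_size` means fewer round trips but more data held in memory per response, so it is the knob to tune against the performance cost mentioned above.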