Elasticsearch: fetching search results in batches?

Date: 2015-06-10 20:24:42

Tags: python elasticsearch

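When I query Elasticsearch with the Python Elasticsearch API, I get about 5000 results. Setting the "size" parameter in the search query to a value larger than the number of results causes the following Java OOM error: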

File "MGDFinder.py", line 114, in <module>
  res = es.search(index="_all", body=queryMaker(state))
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/client/utils.py", line 68, in _wrapped
  return func(*args, params=params, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/client/__init__.py", line 440, in search
  params=params, body=body)
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/transport.py", line 276, in perform_request
  status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/connection/http_urllib3.py", line 55, in perform_request
  self._raise_error(response.status, raw_data)
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/connection/base.py", line 97, in _raise_error
  raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
elasticsearch.exceptions.TransportError: TransportError(500, u'OutOfMemoryError[Java heap space]')

I noticed that this happens when size is set to 700. I don't want to increase the Java heap size. Is there a way to run the search in batches of 500?

1 answer:

Answer 0 (score: 0)

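I don't think a plain search request can be split into batches without increasing the Java heap space; the server would still hold all 5000 results and return them.

I think you can use scroll for this request. Scroll is designed for retrieving a large number of results efficiently, and it works much like a cursor in a traditional database.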

Sample request:

$ curl -XGET 'localhost:9200/world/test/_search?scroll=1m&pretty' -d '
{
    "size": 50,
    "query": {
        "match_all": {}
    }
}'

Sample response:

{
  "_scroll_id" : "cXVlcnlUaGVuRmV0Y2g7NTszNjpXZW9lRnJXSFItT0U2YUtIM1hOa0FBOzM3Oldlb2VGcldIUi1PRTZhS0gzWE5rQUE7Mzg6V2VvZUZyV0hSLU9FNmFLSDNYTmtBQTs0MDpXZW9lRnJXSFItT0U2YUtIM1hOa0FBOzM5Oldlb2VGcldIUi1PRT
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {....

The response includes a scroll ID, which can be used to fetch the next batch of hits.

Sample scroll request (pass the _scroll_id with -d):

$ curl -XGET 'localhost:9200/_search/scroll?scroll=1m&pretty' -d 'cXVlcnlUaGVuRmV0Y2g7NTszMTpXZW9lRnJXSFItT0U2YUtIM1hOa0FBOzMyOldlb2VGcldIUi1PRTZhS0gzWE5rQUE7MzM6V2VvZUZyV0hSLU9FNmFLSDNYTmtBQTszNDpXZW9lRnJXSFItT0U2YUtIM1hOa0FBOzM1Oldlb2VGcldIUi1PRTZhS0gzWE5rQUE7MDs='
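
Since the question uses the Python client, the same request/scroll cycle could be sketched roughly as below. This is a minimal sketch, not a tested drop-in: the helper name iter_hits and its parameters are my own, and it assumes the client object exposes search(), scroll() and clear_scroll() with the keyword arguments shown, which can differ between elasticsearch-py versions.

```python
def iter_hits(es, index, query, batch_size=500, keep_alive="1m"):
    """Yield hits one by one while fetching them from the server in batches.

    `es` is assumed to be an elasticsearch.Elasticsearch client (or any object
    with compatible search/scroll/clear_scroll methods); iter_hits itself is a
    hypothetical helper, not part of the library.
    """
    # The first request opens the scroll context and returns the first batch.
    page = es.search(index=index, body=query, scroll=keep_alive, size=batch_size)
    scroll_id = page.get("_scroll_id")
    try:
        while page["hits"]["hits"]:
            for hit in page["hits"]["hits"]:
                yield hit
            # Each follow-up call returns the next batch; the scroll ID may
            # change between calls, so always reuse the most recent one.
            page = es.scroll(scroll_id=scroll_id, scroll=keep_alive)
            scroll_id = page.get("_scroll_id", scroll_id)
    finally:
        # Release the server-side scroll context once iteration ends.
        if scroll_id:
            es.clear_scroll(scroll_id=scroll_id)
```

With a real client this might be used as `for hit in iter_hits(Elasticsearch(), "_all", queryMaker(state)): ...`, so that only one batch of 500 hits is requested at a time. The client library also ships elasticsearch.helpers.scan, which wraps a similar loop.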

Official documentation: Scroll