How do I use the elasticsearch scroll API with the chewy gem?

Time: 2016-03-22 15:06:29

Tags: ruby-on-rails ruby elasticsearch chewy-gem

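I am using the 'chewy' gem for Elasticsearch in my Rails application, but I could not find any documentation on using the elasticsearch scroll API with it. When I jump to the last page of records, I get the following error.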

[500] {"error":{"root_cause":[{"type":"query_phase_execution_exception","reason":"Result window is too
large, from + size must be less than or equal to: [10000] but was [19450]. See the scroll api for a more
efficient way to request large data sets. This limit can be set by changing the [index.max_result_window]
index level parameter."}],"type":"search_phase_execution_exception","reason":"all shards failed",
"phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"recordings","node":"tgLqH_wwRUG6NmY0PCB0nA",
"reason":{"type":"query_phase_execution_exception","reason":"Result window is too large, from + size must
 be less than or equal to: [10000] but was [19450]. See the scroll api for a more efficient way to request
 large data sets. This limit can be set by changing the [index.max_result_window] index level
 parameter."}}]},"status":500}

Is there a way to use the elasticsearch scroll API with the chewy gem, or is there any other option?

1 answer:

Answer 0: (score: 0)

You can fetch the results in batches with scroll simply by decreasing the size of the query:

  # @example Call the `scroll` API until all the documents are returned
  #
  #     # Index 1,000 documents
  #     client.indices.delete index: 'test'
  #     1_000.times do |i| client.index index: 'test', type: 'test', id: i+1, body: {title: "Test #{i}"} end
  #     client.indices.refresh index: 'test'
  #
  #     # Open the "view" of the index by passing the `scroll` parameter
  #     # Sorting by `_doc` makes the operations faster
  #     r = client.search index: 'test', scroll: '1m',
  #                       body: {size: 100, sort: ['_doc']}
  #
  #     # Display the initial results
  #     puts "--- BATCH 0 -------------------------------------------------"
  #     puts r['hits']['hits'].map { |d| d['_source']['title'] }.inspect
  #
  #     # Call the `scroll` API until empty results are returned
  #     while r = client.scroll(scroll_id: r['_scroll_id'], scroll: '5m') and not r['hits']['hits'].empty? do
  #       puts "--- BATCH #{defined?($i) ? $i += 1 : $i = 1} -------------------------------------------------"
  #       puts r['hits']['hits'].map { |d| d['_source']['title'] }.inspect
  #       puts
  #     end


Example taken from here.
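If you want the same loop in reusable form, something like the following might work. This is only a sketch: `each_scroll_batch` is a hypothetical helper name, not part of chewy or elasticsearch-ruby, and it assumes a client that responds to `search` and `scroll` with the same arguments as in the example above. With chewy, `Chewy.client` should give you such a client, but check that against the version you have installed.

```ruby
# Hypothetical helper (not part of chewy or elasticsearch-ruby).
# Yields each batch of hits from a scrolled search until an empty
# batch comes back. `client` is assumed to behave like the
# elasticsearch-ruby client used in the example above.
def each_scroll_batch(client, index:, size: 100, keep_alive: '1m')
  unless block_given?
    return enum_for(:each_scroll_batch, client,
                    index: index, size: size, keep_alive: keep_alive)
  end

  # Open the scroll; each batch holds at most `size` hits per request,
  # so from + size never grows past index.max_result_window.
  response = client.search(index: index, scroll: keep_alive,
                           body: { size: size, sort: ['_doc'] })

  until response['hits']['hits'].empty?
    yield response['hits']['hits']
    # Ask for the next batch using the scroll id from the last response.
    response = client.scroll(scroll_id: response['_scroll_id'],
                             scroll: keep_alive)
  end
end

# Usage against a real cluster (assumes a reachable node and the
# 'recordings' index from the question):
#
#   each_scroll_batch(Chewy.client, index: 'recordings', size: 500) do |hits|
#     hits.each { |hit| puts hit['_source'].inspect }
#   end
```

Because each request only asks for `size` documents, you can walk all 19450 records without raising `index.max_result_window`. Scroll is meant for iterating over everything rather than for jumping straight to an arbitrary page.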