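I am using the Elasticsearch scroll API. In some cases I want to return the hits on the nth page without returning the hits of the preceding pages. I believe the scroll should behave like an iterator, so I would like to advance the iterator past the first few pages and only actually return the hits of the nth page.
My current code is: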
# Assumes client is an elasticsearch.Elasticsearch instance and that index,
# doc_type, q, wait_time and size are defined earlier.
initial_request = client.search(index=index, doc_type=doc_type, body=q, scroll=str(wait_time) + 'm', search_type='scan', size=size)
sid = initial_request['_scroll_id']  # scroll id
total_hits = initial_request['hits']['total']  # how many results there are
scroll_size = total_hits  # set this to a positive value initially
p = 0  # page counter
while scroll_size > 0:
    p += 1
    print("\t\t Scrolling to page %s ..." % p)
    page = client.scroll(scroll_id=sid, scroll=str(wait_time) + 'm')
    sid = page['_scroll_id']  # update the scroll ID
    scroll_size = len(page["hits"]["hits"])  # no. of hits returned on this page
    # then code to do stuff w/ that page's hits.
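But page = client.scroll(...) actually sends that page's hits back to my local machine. I want to pass over the first n pages and only start receiving the pages' hits after that.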
Any ideas?
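
To make the iterator idea concrete, here is a minimal sketch that wraps the scroll loop in a generator and uses itertools.islice to discard the leading pages. The helper names scroll_pages and pages_from are hypothetical, and the sketch assumes the same client.scroll(scroll_id=..., scroll=...) call and 'scan' search type as the code above, where the initial search response carries no hits. As far as I can tell, a scroll cursor only advances when the next batch is requested, so the skipped pages are still fetched from the cluster; this only avoids processing them locally.

```python
from itertools import islice


def scroll_pages(client, first_response, scroll="1m"):
    """Yield the hits of each scroll page in order (hypothetical helper).

    first_response is the result of the initial client.search(...) call;
    with search_type='scan' it contains a scroll id but no hits.
    """
    sid = first_response["_scroll_id"]
    while True:
        page = client.scroll(scroll_id=sid, scroll=scroll)
        hits = page["hits"]["hits"]
        if not hits:
            return  # an empty page means the scroll is exhausted
        sid = page["_scroll_id"]  # always continue from the newest scroll id
        yield hits


def pages_from(client, first_response, n, scroll="1m"):
    """Skip pages 1 .. n-1 and yield page n onwards (1-based, hypothetical helper)."""
    return islice(scroll_pages(client, first_response, scroll), n - 1, None)
```

With the variables from the code above, usage would look like: for hits in pages_from(client, initial_request, n, str(wait_time) + 'm'): ... followed by the per-page processing.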