Question

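I am capturing network traffic and continuously inserting the data into Elasticsearch. I want to search this data from my Python script.

Here is a small part of my Python code: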

    test = es.search(
        index="argus_data",
        body=dict(query=search_body["query"],
                  size="1000")  # I want this to be "unlimited"
    )

    pprint(test)

I don't know what size to use, because new data keeps arriving. How can I handle this situation? Please help me solve this problem, thanks.

Answer 1

You can do this:

test = es.search(index=['test'], doc_type=['test'], size=1000, from_=0)

Then increase from_ step by step until you have retrieved all the data.

from_ - Starting offset (default: 0). Source: Elasticsearch-Py API Documentation
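The paging idea above can be sketched as a loop. This is only a minimal sketch: `fetch_all_paged` is a hypothetical helper name, `es` is assumed to be an elasticsearch-py client that accepts `body`, `from_`, and `size` the way the calls in this thread do, and the index name in the usage comment is taken from the question.

```python
def fetch_all_paged(es, index, query, page_size=1000):
    """Collect every hit by advancing from_ one page at a time.

    `es` is assumed to be an elasticsearch-py client (or any object whose
    search() accepts the same keyword arguments).
    """
    hits = []
    offset = 0
    while True:
        resp = es.search(
            index=index,
            body={"query": query},
            from_=offset,      # starting offset of this page
            size=page_size,    # maximum number of hits in this page
        )
        page = resp["hits"]["hits"]
        hits.extend(page)
        offset += len(page)
        # A short (or empty) page means there is nothing left to fetch.
        if len(page) < page_size:
            break
    return hits


# Usage (assumes a running cluster and the index from the question):
# all_hits = fetch_all_paged(es, "argus_data", {"match_all": {}})
```

Because new documents keep arriving, pages can shift between requests unless the query uses a stable sort, so treat the result as a best-effort snapshot.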

Answer 2

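First read the number of hits from test['hits']['total'] and store it in a variable.

You have to run the query twice. The first time, use it to get the number of hits (do not pass the size parameter).

test = es.search(index=['test'], doc_type=['test'])
size = test['hits']['total']

The second time, run the query with that size:

test = es.search(index=['test'], doc_type=['test'], size=size)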
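One detail worth guarding against: in Elasticsearch 7 and later, `hits.total` is an object such as `{"value": 42, "relation": "eq"}` rather than a plain integer, and by default the count stops at 10,000 (relation `"gte"`) unless the request sets `track_total_hits` to true. A small sketch that handles both response shapes follows; `total_hits` is a hypothetical helper name, not part of elasticsearch-py.

```python
def total_hits(response):
    """Return the hit count from a search response as an int.

    Handles both the older integer form of hits.total and the
    {"value": ..., "relation": ...} object returned by Elasticsearch 7+.
    """
    total = response["hits"]["total"]
    if isinstance(total, dict):
        return total["value"]
    return total


# Usage (assumes `es` is a connected client):
# size = total_hits(es.search(index="test"))
```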

Answer 3

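If the number of hits exceeds 10,000, this will raise an error: by default Elasticsearch rejects a search whose from + size is larger than index.max_result_window (10,000).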
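One common way around that limit is the scroll API, which returns the result set in batches instead of one oversized page. The following is a minimal sketch, assuming `es` is an elasticsearch-py client whose `search`, `scroll`, and `clear_scroll` methods accept the keyword arguments shown; `scroll_all` and its parameter names are hypothetical. The library also ships `elasticsearch.helpers.scan`, which wraps the same loop.

```python
def scroll_all(es, index, query, page_size=1000, keep_alive="2m"):
    """Yield every matching hit, one batch at a time, via the scroll API.

    `es` is assumed to be an elasticsearch-py client. `keep_alive` is how
    long Elasticsearch keeps the scroll context open between requests.
    """
    resp = es.search(
        index=index,
        body={"query": query},
        scroll=keep_alive,
        size=page_size,
    )
    scroll_id = resp["_scroll_id"]
    try:
        # An empty batch signals that the scroll is exhausted.
        while resp["hits"]["hits"]:
            for hit in resp["hits"]["hits"]:
                yield hit
            resp = es.scroll(scroll_id=scroll_id, scroll=keep_alive)
            scroll_id = resp["_scroll_id"]
    finally:
        # Release the server-side scroll context when done or on error.
        es.clear_scroll(scroll_id=scroll_id)


# Usage (assumes a running cluster and the index from the question):
# for hit in scroll_all(es, "argus_data", {"match_all": {}}):
#     print(hit["_source"])
```

A scroll reads a snapshot taken when the first request was made, so documents indexed after that point are not included; rerun it to pick up newer data.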

Answer 4

# The solution is to count the documents you have, then make your query based on that size

test = es.search(index=['indexname'])
size = test['hits']['total']

# size['value'] is the number of documents

res = es.search(index="indexname", body={'size': size['value'], "query": {"match_all": {}}})

# You can loop over your data like this

i = 0
while i < size['value']:
    print(res['hits']['hits'][i]['_source']['fieldname'])
    i = i + 1

Elasticsearch unlimited size
