Google App Engine NDB: counting records in an NDB model

Date: 2018-10-23 10:38:05

Tags: google-app-engine app-engine-ndb

How many records can we get from Google App Engine in a single query, so that we can show the user a count? And can the timeout limit be increased from 3 seconds to 5 seconds?

1 answer:

Answer 0 (score: 1)

In my experience, ndb cannot fetch more than 1000 records at a time. Here is an example of what happens if I try to use .count() on a table with ~500,000 records:

s~project-id> models.Transaction.query().count()
WARNING:root:suspended generator _count_async(query.py:1330) raised AssertionError()
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/google_appengine/google/appengine/ext/ndb/utils.py", line 160, in positional_wrapper
    return wrapped(*args, **kwds)
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/google_appengine/google/appengine/ext/ndb/query.py", line 1287, in count
    return self.count_async(limit, **q_options).get_result()
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/google_appengine/google/appengine/ext/ndb/tasklets.py", line 383, in get_result
    self.check_success()
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/google_appengine/google/appengine/ext/ndb/tasklets.py", line 427, in _help_tasklet_along
    value = gen.throw(exc.__class__, exc, tb)
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/google_appengine/google/appengine/ext/ndb/query.py", line 1330, in _count_async
    batch = yield rpc
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/google_appengine/google/appengine/ext/ndb/tasklets.py", line 513, in _on_rpc_completion
    result = rpc.get_result()
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/google_appengine/google/appengine/api/apiproxy_stub_map.py", line 614, in get_result
    return self.__get_result_hook(self)
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/google_appengine/google/appengine/datastore/datastore_query.py", line 2910, in __query_result_hook
    self._batch_shared.conn.check_rpc_success(rpc)
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/google_appengine/google/appengine/datastore/datastore_rpc.py", line 1377, in check_rpc_success
    rpc.check_success()
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/google_appengine/google/appengine/api/apiproxy_stub_map.py", line 580, in check_success
    self.__rpc.CheckSuccess()
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/google_appengine/google/appengine/api/apiproxy_rpc.py", line 157, in _WaitImpl
    self.request, self.response)
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/google_appengine/google/appengine/ext/remote_api/remote_api_stub.py", line 308, in MakeSyncCall
    handler(request, response)
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/google_appengine/google/appengine/ext/remote_api/remote_api_stub.py", line 362, in _Dynamic_Next
    assert next_request.offset() == 0
AssertionError

To get past this, you can do something like:

objs = []
cursor = None
more = True
while more:
    # fetch_page returns a (results, next_cursor, more_flag) tuple;
    # passing the cursor back in resumes where the last page ended.
    _objs, cursor, more = models.Transaction.query().fetch_page(
        300, start_cursor=cursor)
    objs.extend(_objs)

But even then, you will eventually hit memory/timeout limits.
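One way to keep the pagination loop above from hitting those limits is to cap the work done per request and report the count as a lower bound once the cap is reached. The sketch below is a framework-free illustration of that idea: a plain list stands in for the Datastore, and the hypothetical `fetch_page()` merely mimics NDB's `(results, cursor, more)` return shape.

```python
# Minimal, framework-free sketch of capped cursor pagination.
# `fetch_page` is a stand-in that mimics NDB's return shape.

def fetch_page(records, page_size, start_cursor=0):
    """Return one page of records plus a cursor and a more-results flag."""
    end = start_cursor + page_size
    return records[start_cursor:end], end, end < len(records)

def count_with_cap(records, page_size=300, cap=1000):
    """Count page by page, stopping once `cap` records have been seen
    so a single request never does unbounded work."""
    total, cursor, more = 0, 0, True
    while more and total < cap:
        page, cursor, more = fetch_page(records, page_size, cursor)
        total += len(page)
    # `more` tells the caller whether the count is only a lower bound.
    return total, more

total, truncated = count_with_cap(range(2500), page_size=300, cap=1000)
```

With 2,500 records and a cap of 1,000, the loop stops after the page that crosses the cap and reports that more records remain, which a UI can render as "1000+".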

Currently, I use Google Dataflow to precompute these values and store the results in Datastore as the model DaySummariesStatsPerUser.
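Precomputing a count usually pairs with keeping it incrementally up to date as records are written. A common App Engine pattern for that is the sharded counter; here is a minimal in-memory sketch, where the dict stands in for one Datastore entity per shard and `SHARDS = 20` is an arbitrary choice, not something from the original answer.

```python
import random

# In-memory sketch of the sharded-counter pattern: writes go to a random
# shard to spread contention, and reads sum a small, fixed number of shards.

SHARDS = 20
shards = {i: 0 for i in range(SHARDS)}  # one counter entity per shard in practice

def increment():
    # Updating a random shard avoids hammering a single entity with writes.
    shards[random.randrange(SHARDS)] += 1

def get_count():
    # Reading the total is a fixed-size fetch of SHARDS values,
    # independent of how many records exist.
    return sum(shards.values())

for _ in range(1234):
    increment()
```

The trade-off is that the count is maintained at write time instead of being computed at read time, so displaying it to the user costs a constant number of reads.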

Edit:

snakecharmerb is correct. I can use .count() in production, but the more entities it has to count, the longer it takes. Here is a screenshot of my log viewer, where it took about 15 seconds to count roughly 330,000 records:

(screenshot: log viewer showing a ~15 s count over ~330,000 records)

When I added a filter to that query that brought the count down to ~4,500, it ran in about a second.

Edit #2:

For what it's worth, I have another App Engine project with around 8,000,000 records. I tried running .count() on it inside an HTTP request handler, and the request timed out after running for 60 seconds.

(screenshot: the request timing out after 60 seconds)
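For counts too large to finish within one request deadline, a generic workaround (not from the original answer) is to count in resumable chunks: each invocation processes a bounded number of pages, persists the cursor and partial total, and re-enqueues itself, e.g. as a deferred task. A framework-free simulation of that idea, with a dict standing in for the Datastore entity that would hold the progress:

```python
# Hypothetical resumable count: each call does bounded work and saves its
# progress, the way a re-enqueued task on App Engine would between requests.

state = {"cursor": 0, "total": 0, "done": False}  # stand-in for a progress entity

def fetch_page(records, page_size, start_cursor):
    """Stand-in mimicking NDB's (results, cursor, more) return shape."""
    end = start_cursor + page_size
    return records[start_cursor:end], end, end < len(records)

def count_step(records, page_size=300, pages_per_call=2):
    """Process a bounded number of pages, then stop and persist progress."""
    for _ in range(pages_per_call):
        page, state["cursor"], more = fetch_page(records, page_size, state["cursor"])
        state["total"] += len(page)
        if not more:
            state["done"] = True
            return
    # In a real app, re-enqueue the task here to continue in a later request.

records = list(range(1500))
while not state["done"]:
    count_step(records)
```

Each `count_step` call stays well under the request deadline because its work is bounded; the driving loop at the bottom plays the role of the task queue repeatedly re-running the task until the count completes.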