Efficient way to run queries in a for loop in Google App Engine?

Date: 2012-08-27 13:00:35

Tags: python google-app-engine google-cloud-datastore

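The GAE documentation states:

"Because each get() or put() operation invokes a separate remote procedure call (RPC), issuing many such calls inside a loop is an inefficient way to process a collection of entities or keys at once."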
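To illustrate the quoted point, here is a minimal sketch that uses a toy in-memory stand-in for the datastore and only counts simulated round trips. `FakeDatastore`, `get`, `get_multi`, `fetch_in_loop`, and `fetch_in_batch` are hypothetical names made up for this illustration, not the ndb API; with real ndb the batch equivalents are `ndb.get_multi` and `ndb.put_multi`.

```python
class FakeDatastore:
    """Toy stand-in for a remote datastore; counts simulated RPC round trips."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.rpc_count = 0

    def get(self, key):
        # One simulated round trip per key.
        self.rpc_count += 1
        return self.rows.get(key)

    def get_multi(self, keys):
        # One simulated round trip for the whole batch of keys.
        self.rpc_count += 1
        return [self.rows.get(k) for k in keys]


def fetch_in_loop(store, keys):
    # The pattern the documentation warns about: one call per key.
    return [store.get(k) for k in keys]


def fetch_in_batch(store, keys):
    # The batched alternative: a single call for all keys.
    return store.get_multi(keys)


keys = ["a@example.com", "b@example.com", "c@example.com"]
rows = {k: {"email": k} for k in keys}

loop_store = FakeDatastore(rows)
loop_result = fetch_in_loop(loop_store, keys)

batch_store = FakeDatastore(rows)
batch_result = fetch_in_batch(batch_store, keys)

print(loop_store.rpc_count, batch_store.rpc_count)
```

Both versions return the same rows; the loop makes one simulated round trip per key, while the batch call makes one in total.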

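Who knows how many other inefficiencies there are in my code, so I would like to minimize them where I can. Currently I have a for loop in which every iteration runs a separate query. Say I have a user, and the user has friends. I want to get the latest updates from each of the user's friends. So what I have is an array of that user's friends: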

for friend_dic in friends:
    email = friend_dic['email']
    lastUpdated = friend_dic['lastUpdated']
    userKey = Key('User', email)
    query = ndb.gql('SELECT * FROM StatusUpdates WHERE ANCESTOR IS :1 AND modifiedDate > :2', userKey, lastUpdated)
    qit = query.iter()
    while (yield qit.has_next_async()):
        status = qit.next()
        status_list.append(status.to_dict())
raise ndb.Return(status_list)

Is there a more efficient way to do this, perhaps by somehow combining all of these into a single query?

2 Answers:

Answer 0: (score: 4)

Take a look at NDB's map function: https://developers.google.com/appengine/docs/python/ndb/queryclass#Query_map_async

Example (assuming you store the friend relationships in a separate model; for this example I assume a Relationships model):

from google.appengine.ext import ndb

@ndb.tasklet
def callback(entity):
  # Assumption: each Relationships entity exposes `email` and `lastUpdated`
  # properties. The original snippet read them from an undefined friend_dic.
  userKey = ndb.Key('User', entity.email)
  query = ndb.gql('SELECT * FROM StatusUpdates WHERE ANCESTOR IS :1 AND modifiedDate > :2', userKey, entity.lastUpdated)
  status_updates = yield query.fetch_async()
  raise ndb.Return(status_updates)

# The lines below use yield, so they must run inside an @ndb.tasklet.
qry = ndb.gql("SELECT * FROM Relationships WHERE friend_to = :1", user.key)
updates = yield qry.map_async(callback)
# updates is now a list holding one list of status updates per relationship

Update

With a better understanding of your data model:

# Start every query first so the datastore RPCs run concurrently.
futures = []
status_list = []
for friend_dic in friends:
  email = friend_dic['email']
  lastUpdated = friend_dic['lastUpdated']
  userKey = ndb.Key('User', email)
  futures.append(ndb.gql('SELECT * FROM StatusUpdates WHERE ANCESTOR IS :1 AND modifiedDate > :2', userKey, lastUpdated).fetch_async())

# Then collect the results. This code must run inside an @ndb.tasklet.
for future in futures:
  statuses = yield future
  status_list.extend([x.to_dict() for x in statuses])

raise ndb.Return(status_list)

Answer 1: (score: 1)

You can use the ndb async methods to run these queries concurrently:

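from google.appengine.ext import ndb

class Bar(ndb.Model):
  pass

class Foo(ndb.Model):
  pass

# put_multi returns the keys of the stored entities, not the entities.
bar_keys = ndb.put_multi([Bar() for i in range(10)])
ndb.put_multi([Foo(parent=bar_key) for bar_key in bar_keys])

# Start all ancestor queries before waiting on any of them.
futures = [Foo.query(ancestor=bar_key).fetch_async(10) for bar_key in bar_keys]
for f in futures:
  print(f.get_result())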

This starts 10 concurrent datastore query RPCs, so the overall latency depends only on the slowest one rather than on the sum of all of them.
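The latency claim above can be checked with a small simulation. This is only a sketch under stated assumptions: `fake_query` is a hypothetical stand-in that sleeps instead of issuing a datastore RPC, and a thread pool plays the role of ndb's futures (ndb itself uses its own event loop rather than threads).

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_query(latency):
    # Hypothetical stand-in for one datastore query: just waits, then returns.
    time.sleep(latency)
    return latency


latencies = [0.05, 0.10, 0.20]

# Sequential: each call waits for the previous one to finish.
start = time.perf_counter()
sequential_results = [fake_query(latency) for latency in latencies]
sequential_elapsed = time.perf_counter() - start

# Concurrent: start every call first, then collect the results.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=len(latencies)) as pool:
    futures = [pool.submit(fake_query, latency) for latency in latencies]
    concurrent_results = [f.result() for f in futures]
concurrent_elapsed = time.perf_counter() - start

print("sequential: %.2fs, concurrent: %.2fs" % (sequential_elapsed, concurrent_elapsed))
```

The sequential run takes roughly the sum of the simulated latencies, while the concurrent run takes roughly as long as the slowest one. Exact timings vary by machine.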

See also the official ndb documentation for more details on how to use the ndb async API.