Question

I'm looking for a way to handle a "new" listing the way reddit / Hacker News do (it seems to be a common approach on many large sites). It appears to work like this:

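When a new link is submitted, a fixed number of the newest entries is fetched
The result is split into pages of PER_PAGE entries and cached under cachekey = newestPage1,2,3,4
Clicking the next/previous button loads the next/previous cache key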

My problem: I can't find SQLAlchemy / flask-sqlalchemy code for a query that fetches a fixed number of the newest entries.

How would I say something like:

q = PostDB.query(order_by('creation_time').desc()).limit(1000)
for chunkOf50Results in q:
  cache.set(CachedChunk+=1, chunkOf50Results)

Answer 1

If you slice a query in SQLAlchemy, it automatically applies a LIMIT (and OFFSET) to the result set fetched from the database:

limitedQuery = q[:50]

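If you get a count first, you can easily loop over the results in chunks:

count = q.count()
CachedChunk = 0  # page counter, used as the cache key
for chunkstart in xrange(0, count, 50):  # Python 2; use range() on Python 3
    CachedChunk += 1
    chunkend = min(chunkstart + 50, count)
    cache.set(CachedChunk, q[chunkstart:chunkend])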
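
The slice arithmetic in that loop can be checked on its own, without a database. The following is a minimal Python 3 sketch; `page_bounds` is a hypothetical helper name, not part of SQLAlchemy, and each pair it yields is meant to be used as `q[start:end]`.

```python
def page_bounds(count, per_page=50):
    """Yield (start, end) slice bounds that together cover `count` rows."""
    for start in range(0, count, per_page):
        # The last page may be shorter than per_page, so clamp the end.
        yield start, min(start + per_page, count)


# 120 rows split into pages of 50: two full pages and one partial page.
bounds = list(page_bounds(120, 50))
```

Each pair corresponds to one sliced query, which is why this approach issues one database query per page.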

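Note that this issues multiple queries to the database. Alternatively, you can use the itertools.izip_longest() function (zip_longest() in Python 3) to produce groups of 50 items: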

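from itertools import izip_longest

# iter() gives one shared iterator; repeating it 50 times makes izip_longest pull 50 rows per group
for chunkOf50Results in izip_longest(*[iter(q.yield_per(50))]*50):
     CachedChunk += 1
     cache.set(CachedChunk, chunkOf50Results)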

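I used .yield_per(50) to limit row prefetching to the batch size, so you don't prefetch more than each batch needs.

The izip_longest(*[iterator]*n) trick yields groups of size n from the underlying iterator, because all n arguments are the same iterator object: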

>>> import itertools
>>> list(itertools.izip_longest(*[iter(range(7))]*3))
[(0, 1, 2), (3, 4, 5), (6, None, None)]

Note that the last batch is padded with None values to fill out the batch size.
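
If the None padding is unwanted in the cached pages, one option is to pad with a private sentinel and strip it afterwards. This is a minimal sketch in Python 3 (where the function is named `zip_longest`); `chunks` and `_PAD` are hypothetical names introduced here for illustration.

```python
from itertools import zip_longest

_PAD = object()  # private sentinel, so genuine None values in the data survive


def chunks(iterable, size):
    """Yield lists of up to `size` items; the last list may be shorter."""
    it = iter(iterable)  # one shared iterator, repeated `size` times below
    for group in zip_longest(*[it] * size, fillvalue=_PAD):
        yield [item for item in group if item is not _PAD]


batches = list(chunks(range(7), 3))
```

Using a dedicated sentinel rather than filtering on None keeps rows that legitimately contain None intact.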

Answer 2

The solution is to use itertools.izip_longest with the trick described in the docs:

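from itertools import izip_longest

# Newest first, capped at 1000 rows (assuming PostDB is a Flask-SQLAlchemy model)
q = PostDB.query.order_by(PostDB.creation_time.desc()).limit(1000)

# The same iterator repeated 50 times, so each group takes 50 consecutive rows
query_iterators = [iter(q)] * 50

for CachedChunk, chunk_of_50 in enumerate(izip_longest(*query_iterators)):
     cache.set(CachedChunk, chunk_of_50)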

This should hit the database once, fetching at most 1000 posts, and then let you split them into batches of 50 and cache them.
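
To tie this back to the newestPage1,2,3,4 keys from the question, here is a rough Python 3 sketch of the caching step. It uses `itertools.islice` instead of the izip_longest trick, so the last page is not padded. The `cache_pages` helper and the plain dict standing in for the cache are assumptions made for illustration; with a real cache object the assignment would become a `cache.set(key, chunk)` call as in the question.

```python
from itertools import islice


def cache_pages(rows, cache, per_page=50, prefix="newestPage"):
    """Store consecutive pages of `rows` under prefix1, prefix2, ...

    `cache` is any dict-like object (a stand-in for a real cache).
    Returns the number of pages written.
    """
    it = iter(rows)  # iterate once, e.g. over a single limited query
    page = 0
    while True:
        chunk = list(islice(it, per_page))  # take up to per_page rows
        if not chunk:
            break
        page += 1
        cache[f"{prefix}{page}"] = chunk
    return page


fake_cache = {}
pages = cache_pages(range(120), fake_cache)
```

Because the rows are consumed from a single iterator, a query passed in as `rows` would be executed only once.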

How do I limit the number of returned results to only the 1000 newest entries using Flask / SQLAlchemy?
