Question

I need to iterate over a large collection (3 * 10^6 elements) in Django to do some analysis that can't be done with a single SQL statement.

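Is it possible to turn off collection caching in Django? (Caching all the data is not acceptable; the data is around 0.5 GB.)
Is it possible to make Django fetch the collection in chunks? It seems to pre-fetch the whole collection into memory and only then iterate over it. I conclude this from the observed execution speed:
- iter(Coll.objects.all()).next() - this takes forever
- iter(Coll.objects.all()[:10000]).next() - this takes less than a second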

Answer 1

Use QuerySet.iterator() to walk over the results instead of loading them all first.
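A minimal sketch of how that could look. The helper name `process_streaming` and the `handle` callback are hypothetical, and the `chunk_size` argument is an assumption that holds only for newer Django releases (older versions accept no arguments to `iterator()`):

```python
def process_streaming(queryset, handle, chunk_size=2000):
    """Call handle(obj) for each row without filling the QuerySet result cache.

    `queryset` is expected to be a Django QuerySet, e.g. Coll.objects.all().
    `chunk_size` is passed through to QuerySet.iterator(); this assumes a
    Django version whose iterator() accepts that keyword.
    """
    count = 0
    for obj in queryset.iterator(chunk_size=chunk_size):
        handle(obj)  # analyse one object at a time
        count += 1
    return count
```

Usage would be something like `process_streaming(Coll.objects.all(), analyse)`, where `analyse` is your own function. Note that this only avoids caching at the QuerySet level; whether rows are actually fetched from the database in chunks still depends on the backend.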

Answer 2

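It seems the problem was caused by the database backend (sqlite), which doesn't support reading data in chunks. I used sqlite because the database will be thrown away after I finish all the computations, but it seems sqlite isn't good even for that.

Here is what I found in the Django source code for the sqlite backend: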

class DatabaseFeatures(BaseDatabaseFeatures):
    # SQLite cannot handle us only partially reading from a cursor's result set
    # and then writing the same rows to the database in another cursor. This
    # setting ensures we always read result sets fully into memory all in one
    # go.
    can_use_chunked_reads = False
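One possible workaround, given that the sliced query in the question returns quickly, is to issue many small queries ordered by primary key instead of one large one. The following is a minimal sketch of that idea using Python's standard sqlite3 module directly rather than the Django ORM; the table `coll`, its columns, and both function names are hypothetical:

```python
import sqlite3


def iter_in_chunks(conn, chunk_size=10000):
    """Yield (id, value) rows from the hypothetical table `coll` in id order.

    Each chunk is a separate short query, so only `chunk_size` rows are
    held in memory at a time. Assumes positive integer primary keys.
    """
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id, value FROM coll WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, chunk_size),
        ).fetchall()
        if not rows:
            return
        for row in rows:
            yield row
        last_id = rows[-1][0]  # resume after the last id seen


def build_demo_db(n):
    """Create an in-memory database with n demo rows (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE coll (id INTEGER PRIMARY KEY, value INTEGER)")
    conn.executemany(
        "INSERT INTO coll (id, value) VALUES (?, ?)",
        ((i, i * i) for i in range(1, n + 1)),
    )
    conn.commit()
    return conn
```

With the ORM, the same pattern would presumably be a loop over filtered, pk-ordered slices of `Coll.objects`, but that variant is not shown or tested here.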

Iterating over a large collection in Django - caching issue
