Question

I am reading a file and building a dictionary from each line I read.

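Inside the for loop I append these dictionaries to a list, and when len(list) reaches 10K I send the list to con.execute(add.insert(list)) and clear it. The problem is that the last list of dictionaries never reaches 10K, so the final batch is never inserted.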

I don't think building one large list of dictionaries in memory and then iterating over it would be a performant way to handle this.

Is there a method in SQLAlchemy where I just send the dictionaries and set a batch size limit, so that it handles the load itself? Or is there some other workaround?

    chunks.append(data_dict)
    if len(chunks) == 10000:
        con.execute(add.insert(chunks))
        del chunks[:]

A newbie SQLAlchemy learner

Answer 1

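For a naive solution, you can take advantage of the fact that variables assigned inside a for loop are still accessible after the loop ends. With a few other tweaks, I think this will work: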

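    chunks = []
    for counter, data_dict in enumerate(data_dict_list, start=1):
        chunks.append(data_dict)
        # Flush after every 10,000th row (counter starts at 1).
        if counter % 10000 == 0:
            con.execute(add.insert(), chunks)
            chunks = []
    # chunks is still in scope here; insert the final partial batch, if any.
    if chunks:
        con.execute(add.insert(), chunks)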
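
Since the question also worries about holding every row in memory, here is a minimal sketch of the same idea written against a lazy iterable, so that only one batch exists at a time. The names batched, insert_in_batches and execute_batch are hypothetical helpers made up for this illustration, not part of SQLAlchemy, and the database call is replaced by a plain list so the snippet runs on its own.

```python
from itertools import islice


def batched(iterable, batch_size):
    """Yield lists of at most batch_size items; the last list may be shorter."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def insert_in_batches(rows, execute_batch, batch_size=10000):
    """Call execute_batch once per batch and return the number of rows sent."""
    total = 0
    for batch in batched(rows, batch_size):
        execute_batch(batch)
        total += len(batch)
    return total


# Stand-in for the database call so the sketch runs without a database.
# With SQLAlchemy, execute_batch could be (assumption, using the names from
# the question): lambda batch: con.execute(add.insert(), batch)
sent_batches = []
rows = ({"line_no": i} for i in range(25))  # lazy, like reading a file line by line
total = insert_in_batches(rows, sent_batches.append, batch_size=10)
print(total, [len(b) for b in sent_batches])  # prints: 25 [10, 10, 5]
```

Because the generator stops only when the source is exhausted, the final short batch is sent like any other, and nothing is sent when the input is empty.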

If you want to use more of SQLAlchemy's features, this documentation page has a bulk insert example that follows a similar pattern:

http://docs.sqlalchemy.org/en/latest/faq/performance.html#i-m-inserting-400-000-rows-with-the-orm-and-it-s-really-slow

SQLAlchemy: inserting a list of dictionaries in batches
