Question

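I have a database created with SQLAlchemy. The engine is postgres, and the database's normal use case is bulk insertion. Several guides recommend disabling constraints (the insertion process is well tested), doing the bulk insert, and then re-enabling the constraints. The reason is that maintaining indexes and checking constraints (foreign keys etc.) on the fly is computationally expensive. Bulk insertions become costly, and the cost grows noticeably as the tables get bigger.

Foreign key constraints are maintained (+ easy to test), but I'm not sure about the indexes: bulk inserting this way does not build the indexes, so they exist only "on paper". Is that right?

If so, is there a simple way to:

A. Recreate the constraints (most importantly the indexes) for all tables?
B. Manually drop all indexes and then recreate them?
C. Trigger an "update" of the indexes?

I'd prefer to keep the solution within SQLAlchemy, but if there is a simple postgres solution, I'll take it.

Edit: OK, it seems there is a simple postgres statement that does this: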

conn.execution_options(isolation_level="AUTOCOMMIT").execute(
                       "REINDEX DATABASE my_db_name;")

Is there a "SQLAlchemy"-friendly way to do the same thing?

Example table declaration:

import sqlalchemy as db
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Table
from utils.db_utils import parse_datetime_to_short_date
Base = declarative_base()

class ProductsInfo(Base):
    __table__ = Table('products_info', Base.metadata,
                      db.Column('product_id', db.Integer, primary_key=True, autoincrement=False, 
                                 nullable=False, index=True, unique=True, comment='Unique product-id'),
                      db.Column('link', db.String(255), default=None, comment='Product URL'),
                      )
    def __repr__(self):
        return f"<ProductsInfo(product_id='{self.product_id}', link='{self.link}')>"
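
For a table declared like this, the "drop all indexes, then recreate them" idea from the question can be sketched with SQLAlchemy's own `Index.drop()` and `Index.create()` methods. This is only a sketch under assumptions: `drop_indexes` and `create_indexes` are hypothetical helper names, `engine` is assumed to be an Engine bound to the target database, and, as far as I understand, `Table.indexes` lists only indexes declared through `Index(...)` or `index=True`, not the index that backs the primary key constraint.

```python
def drop_indexes(table, engine):
    """Drop every index SQLAlchemy knows about for `table`.

    Returns the dropped Index objects so they can be recreated later.
    The objects stay registered in the table's metadata; only the
    database-side indexes are removed.
    """
    indexes = list(table.indexes)
    for index in indexes:
        index.drop(engine)  # emits DROP INDEX
    return indexes


def create_indexes(indexes, engine):
    """Recreate the indexes previously returned by drop_indexes()."""
    for index in indexes:
        index.create(engine)  # emits CREATE INDEX
```

A possible use is `saved = drop_indexes(ProductsInfo.__table__, engine)` before the bulk insert and `create_indexes(saved, engine)` after it, so each index is built once over the loaded data.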

Answer 1

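I ended up doing the bulk insertions like this:

Disable the triggers of the table (not of all tables) using postgres
Bulk insert
Enable the triggers of the table
Reindex the table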

ALTER TABLE my_table DISABLE TRIGGER ALL

-- bulk insertions

ALTER TABLE my_table ENABLE TRIGGER ALL

conn.execution_options(isolation_level="AUTOCOMMIT").execute("REINDEX TABLE my_table;")
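
The four steps above can be wrapped in one Python helper. This is a minimal sketch under assumptions, not a definitive implementation: `bulk_load`, `quote_ident`, and the `insert_rows` callback are hypothetical names, SQLAlchemy 1.4 or later is assumed (for `Connection.exec_driver_sql`), and the table name is assumed to come from trusted code. Two caveats I am fairly confident about: `DISABLE TRIGGER ALL` also covers the internal foreign-key triggers and therefore needs superuser privileges, and disabling triggers does not stop PostgreSQL from maintaining indexes during the inserts, so the final `REINDEX` is kept here only because the answer includes it.

```python
def quote_ident(name):
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def bulk_load(engine, table_name, insert_rows):
    """Disable triggers, insert, re-enable triggers, then REINDEX one table.

    engine      -- SQLAlchemy Engine connected to PostgreSQL
    table_name  -- name of the single table being loaded
    insert_rows -- hypothetical callback; receives the open connection
                   and performs the actual bulk insert
    """
    table = quote_ident(table_name)

    # Steps 1-3 share one transaction: ALTER TABLE is transactional in
    # PostgreSQL, so a failed insert also rolls back the trigger change.
    with engine.begin() as conn:
        conn.exec_driver_sql(f"ALTER TABLE {table} DISABLE TRIGGER ALL")
        insert_rows(conn)
        conn.exec_driver_sql(f"ALTER TABLE {table} ENABLE TRIGGER ALL")

    # Step 4 runs on a separate autocommit connection, as in the answer.
    with engine.connect() as conn:
        autocommit_conn = conn.execution_options(isolation_level="AUTOCOMMIT")
        autocommit_conn.exec_driver_sql(f"REINDEX TABLE {table}")
```

For the table from the question, a call might look like `bulk_load(engine, "products_info", lambda conn: conn.execute(ProductsInfo.__table__.insert(), rows))`, where `rows` is an assumed list of dictionaries, one per row.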
