Question

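I'd like to write a data migration that modifies all rows in a big table in smaller batches, in order to avoid locking issues. However, I can't figure out how to commit manually in a Django migration. Every time I try to run commit I get: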

TransactionManagementError: This is forbidden when an 'atomic' block is active.

AFAICT, the database schema editor always wraps Postgres migrations in an atomic block.

Is there a sane way to break out of the transaction from within the migration?

My migration looks like this:

from django.db import migrations, transaction

def modify_data(apps, schema_editor):
    counter = 0
    BigData = apps.get_model("app", "BigData")
    for row in BigData.objects.iterator():
        # Modify row [...]
        row.save()
        # Commit every 1000 rows
        counter += 1
        if counter % 1000 == 0:
            transaction.commit()
    transaction.commit()

class Migration(migrations.Migration):
    operations = [
        migrations.RunPython(modify_data),
    ]

I'm using Django 1.7 and Postgres 9.3. This used to work with South and older versions of Django.

Answer 1

The best workaround I found is to manually exit the atomic scope before running the data migration:

def modify_data(apps, schema_editor):
    schema_editor.atomic.__exit__(None, None, None)
    # [...]

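As opposed to resetting connection.in_atomic_block manually, this allows using the atomic context manager inside the migration. There doesn't seem to be a much saner way.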

The (admittedly messy) transaction break-out logic can be contained in a decorator for use with the RunPython operation:

from functools import wraps

def non_atomic_migration(func):
    """
    Close a transaction from within code that is marked atomic. This is
    required to break out of a transaction scope that is automatically wrapped
    around each migration by the schema editor. This should only be used when
    committing manually inside a data migration. Note that it doesn't re-enter
    the atomic block afterwards.
    """
    @wraps(func)
    def wrapper(apps, schema_editor):
        # Only exit if the schema editor actually opened an atomic block.
        if schema_editor.connection.in_atomic_block:
            schema_editor.atomic.__exit__(None, None, None)
        return func(apps, schema_editor)
    return wrapper

Update

Django 1.10 will support non-atomic migrations.
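
With a non-atomic migration there is no outer transaction, so each batch can be committed by wrapping it in its own atomic block instead of calling transaction.commit(). The following is a minimal sketch of that batching, not code from the answer: batches and modify_in_batches are hypothetical helper names, and atomic is assumed to be django.db.transaction.atomic, passed in as an argument so that the helper itself does not import Django.

```python
from itertools import islice


def batches(iterable, size):
    """Yield lists of up to `size` items taken from `iterable`."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def modify_in_batches(rows, modify_row, atomic, batch_size=1000):
    """Modify and save rows, using one transaction per batch.

    Assumptions of this sketch (hypothetical names, not from the thread):
    - `rows` is an iterable of model instances, e.g. a queryset iterator;
    - `modify_row` is a callable that changes one row in place;
    - `atomic` is django.db.transaction.atomic (or a compatible callable
      that returns a context manager).
    Returns the number of rows processed.
    """
    processed = 0
    for batch in batches(rows, batch_size):
        # The outermost atomic block commits when it exits without an error.
        with atomic():
            for row in batch:
                modify_row(row)
                row.save()
        processed += len(batch)
    return processed
```

Inside a function run by RunPython, this might be called as modify_in_batches(BigData.objects.iterator(), modify_row, transaction.atomic), where modify_row is whatever per-row change the migration needs. As far as I can tell, on PostgreSQL the per-batch commits only take effect if the migration as a whole is non-atomic (atomic = False on the Migration class, Django 1.10+); otherwise each inner atomic block only creates a savepoint inside the migration's transaction.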

Answer 2

From the documentation about RunPython:

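By default, RunPython will run its contents inside a transaction on databases that do not support DDL transactions (for example, MySQL and Oracle). This should be safe, but may cause a crash if you attempt to use the schema_editor provided on these backends; in this case, pass atomic=False to the RunPython operation.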

So, instead of what you've got, use:

class Migration(migrations.Migration):
  operations = [
      migrations.RunPython(modify_data, atomic=False),
  ]

Answer 3

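For others coming across this: you can have both schema changes and data changes (RunPython) in the same migration. Just make sure all the ALTER TABLE operations come first; you cannot run RunPython before any ALTER TABLE.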

Committing manually in a Django data migration
