Updating multiple objects at once in Django?

Time: 2016-04-20 17:46:28

Tags: python django

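I'm using Django 1.9. I have a Django table that stores the values of a particular measure per organisation and per month, with raw values and percentiles: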

class MeasureValue(models.Model):
    org = models.ForeignKey(Org, null=True, blank=True)
    month = models.DateField()
    calc_value = models.FloatField(null=True, blank=True)
    percentile = models.FloatField(null=True, blank=True)

There are typically around 10,000 values per month. My question is whether I can speed up the process of setting values on the models.

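Currently, I calculate percentiles by retrieving all the measure values for a month with a Django filter query, converting them to a pandas DataFrame, and then using scipy's rankdata to set ranks and percentiles. I do this because pandas and rankdata are efficient, can ignore null values, and handle repeated values the way I want, so I'm happy with this method: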

import pandas as pd

records = MeasureValue.objects.filter(month=month).values()
df = pd.DataFrame.from_records(records)
# use calc_value to set percentile on each row, using scipy's rankdata
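
The question does not show the ranking step itself. As a rough illustration only, the sketch below computes percentiles in plain Python, skipping missing values and giving tied values the average of their ranks (the behaviour of rankdata's default "average" method). The function name `percentile_ranks`, the sample data, and the 0 to 100 scaling are assumptions made for this example, not the asker's actual code.

```python
import math


def percentile_ranks(values):
    """Return a percentile (0-100) per value; None/NaN inputs map to None."""
    # Keep only usable values, remembering each one's original position.
    valid = [
        (v, i) for i, v in enumerate(values)
        if v is not None and not math.isnan(v)
    ]
    valid.sort(key=lambda pair: pair[0])
    n = len(valid)
    result = [None] * len(values)
    start = 0
    while start < n:
        end = start
        # Extend the window over all entries tied with valid[start].
        while end + 1 < n and valid[end + 1][0] == valid[start][0]:
            end += 1
        # Ranks are 1-based; tied entries share the mean of their ranks.
        avg_rank = (start + 1 + end + 1) / 2
        # Assumed scaling: lowest rank maps to 0, highest rank to 100.
        pct = 100.0 * (avg_rank - 1) / (n - 1) if n > 1 else 0.0
        for _, idx in valid[start:end + 1]:
            result[idx] = pct
        start = end + 1
    return result


# Hypothetical calc_value column for one month, with a missing value and a tie.
calc_values = [10.0, None, 30.0, 10.0, 20.0]
percentiles = percentile_ranks(calc_values)
```

The missing entry stays None, which matches the intent of the None/NaN check in the update loop.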

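However, I then need to retrieve each percentile value from the DataFrame and set it back onto the model instances. Right now I do this by iterating over the DataFrame's rows and updating each instance: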

for i, row in df.iterrows():
    mv = MeasureValue.objects.get(org=row.org, month=month)
    if (row.percentile is None) or np.isnan(row.percentile):
        row.percentile = None
    mv.percentile = row.percentile
    mv.save()

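Unsurprisingly, this is rather slow. Is there an efficient Django way to speed it up by making a single database write rather than tens of thousands? I have checked the documentation, but can't see one.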

3 Answers:

Answer 0: (score: 15)

An atomic transaction can reduce the time spent in the loop:

from django.db import transaction

with transaction.atomic():
    for i, row in df.iterrows():
        mv = MeasureValue.objects.get(org=row.org, month=month)

        if (row.percentile is None) or np.isnan(row.percentile): 
            # if it's already None, why set it to None?
            row.percentile = None

        mv.percentile = row.percentile
        mv.save()

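Django's default behavior is to run in autocommit mode. Each query is immediately committed to the database unless a transaction is active.

By using with transaction.atomic(), all the updates are grouped into a single transaction. The time needed to commit the transaction is amortized over all the enclosed statements, so the time per statement is greatly reduced.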
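
The difference between the two modes can be seen without Django. The sketch below uses the standard-library sqlite3 module as a stand-in for the ORM; the table name, column names, and values are made up for illustration. The first loop commits after every UPDATE, the second wraps its UPDATEs in one explicit transaction with a single commit.

```python
import sqlite3

# isolation_level=None puts the sqlite3 module in autocommit mode, so a
# transaction exists only where BEGIN/COMMIT are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE measurevalue (id INTEGER PRIMARY KEY, percentile REAL)"
)
conn.executemany(
    "INSERT INTO measurevalue (id) VALUES (?)",
    [(i,) for i in range(1, 1001)],
)

# Hypothetical (percentile, pk) pairs to write back.
new_percentiles = [(i / 10.0, i) for i in range(1, 1001)]
update_sql = "UPDATE measurevalue SET percentile = ? WHERE id = ?"

# Autocommit: every UPDATE is its own transaction with its own commit.
for params in new_percentiles[:500]:
    conn.execute(update_sql, params)

# Explicit transaction: 500 UPDATEs share one commit at the end.
conn.execute("BEGIN")
for params in new_percentiles[500:]:
    conn.execute(update_sql, params)
conn.execute("COMMIT")

updated = conn.execute(
    "SELECT COUNT(*) FROM measurevalue WHERE percentile IS NOT NULL"
).fetchone()[0]
```

With an in-memory database the commit cost is negligible, so any speed-up would only show on a disk-backed or networked database. In Django, transaction.atomic() plays the role of the BEGIN/COMMIT pair and rolls back if the block raises. Note that the loop still issues one statement per row; only the commits are merged.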

Answer 1: (score: 1)

As of Django 2.2, you can use the bulk_update() queryset method to efficiently update the given fields on the provided model instances, generally with one query.

In older versions of Django, you could use update() with Case / When, e.g.:

from django.db.models import Case, When

Entry.objects.filter(
    pk__in=headlines  # `headlines` is a pk -> headline mapping
).update(
    headline=Case(*[When(pk=entry_pk, then=headline)
                    for entry_pk, headline in headlines.items()]))
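
To make the single-query idea concrete, here is a sketch of roughly the kind of SQL a Case / When update boils down to, run through the standard-library sqlite3 module rather than the ORM. The table layout and the sample headlines are assumptions for illustration, and the exact SQL Django emits differs in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (id INTEGER PRIMARY KEY, headline TEXT)")
conn.executemany(
    "INSERT INTO entry (id, headline) VALUES (?, ?)",
    [(1, "Entry 1"), (2, "Entry 2"), (3, "Entry 3")],
)

# pk -> headline mapping, as in the answer (the values here are made up).
headlines = {1: "This is entry 1", 2: "This is entry 2"}

# One "WHEN ? THEN ?" arm per row, plus one placeholder per pk for the filter.
when_arms = " ".join("WHEN ? THEN ?" for _ in headlines)
pk_slots = ", ".join("?" for _ in headlines)
sql = (
    f"UPDATE entry SET headline = CASE id {when_arms} END "
    f"WHERE id IN ({pk_slots})"
)

# Parameters: pk1, headline1, pk2, headline2, ..., then the pks again.
params = [value for pair in headlines.items() for value in pair]
params += list(headlines)

conn.execute(sql, params)  # a single statement updates every listed row
conn.commit()

rows = dict(conn.execute("SELECT id, headline FROM entry ORDER BY id"))
```

The WHERE clause matters: without it, rows that match no WHEN arm would have their headline set to NULL. Databases also cap the number of bound parameters per statement, so for something like 10,000 rows the update would likely need to be split into batches; bulk_update() accepts a batch_size argument for that purpose.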

Answer 2: (score: 0)

Actually, when trying @Eugene Yarmash's answer, I found that I got this error:

FieldError: Joined field references are not permitted in this query

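But I believe that calling update in a loop is still quicker than multiple saves, and I expect that using a transaction should speed it up further.

So, for versions of Django that don't offer bulk_update, and assuming the same data as in Eugene's answer, where headlines is a pk -> headline mapping: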

from django.db import transaction

with transaction.atomic():
    for entry_pk, headline in headlines.items():
        Entry.objects.filter(pk=entry_pk).update(headline=headline)