Question

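I am using Microsoft SQL Server and Entity Framework. I need to insert N (for example, 10,000) items. Before inserting each item, I have to insert a new group or update the existing one. Performance is poor because I generate too many queries: on every loop iteration I look up the group by querying the Groups table on three (already indexed) parameters.

I was thinking of fetching all the groups up front with a WHERE IN query (Groups.Where(g => owners.Contains(g.OwnerId) && ..)), but I remember that such queries are limited by the number of parameters.

Maybe I should write a stored procedure?

Here is my sample code. I wrap the EF DbContext behind an IUnitOfWork pattern: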

public async Task InsertAsync(IItem item)
{
    var existingGroup = await this.unitOfWork.Groups.GetByAsync(item.OwnerId, item.Type, item.TypeId);

    if (existingGroup == null)
    {
        existingGroup = this.unitOfWork.Groups.CreateNew();
        existingGroup.State = GroupState.New;
        existingGroup.Type = item.Code;
        existingGroup.TypeId = item.TypeId;
        existingGroup.OwnerId = item.OwnerId;
        existingGroup.UpdatedAt = item.CreatedAt;

        this.unitOfWork.Groups.Insert(existingGroup);
    }
    else
    {
        existingGroup.UpdatedAt = item.CreatedAt;
        existingGroup.State = GroupState.New;

        this.unitOfWork.Groups.Update(existingGroup);
    }

    this.unitOfWork.Items.Insert(item);
}

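foreach (var item in items)
{
    // Await each call: InsertAsync is async, and a DbContext does not support concurrent operations.
    await InsertAsync(item);
}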

await this.unitOfWork.SaveChangesAsync();

Answer 1

For bulk inserts, three things are key to improving performance:

Set AutoDetectChangesEnabled and ValidateOnSaveEnabled to false:

_db.Configuration.AutoDetectChangesEnabled = false;
_db.Configuration.ValidateOnSaveEnabled = false;

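Split the inserts into segments that share one DbContext, then recreate the context for the next segment. How large a segment should be varies from use case to use case; I got the best performance with about 100 elements before recreating the context. The reason is that the DbContext tracks every element attached to it, so it slows down as it grows. Also, make sure you do not recreate the context for every single insert. (See Slauma's answer in "Fastest Way of Inserting in Entity Framework".)
When checking other tables, use IQueryable wherever possible, and call ToList() or FirstOrDefault() only when necessary, because ToList() and FirstOrDefault() load the objects into memory. (See Richard Szalay's answer in "What's the difference between IQueryable and IEnumerable".)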
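The second tip (save in segments, then recreate the context) can be sketched roughly as follows. This is only an illustration under assumptions: BatchInserter and InsertInBatches are hypothetical names, not part of Entity Framework, and the context is passed in through delegates so the helper itself does not depend on EF. The default batch size of 100 is simply the value that worked in my case.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not an EF API): adds items to a context in segments,
// saving and disposing the context after each segment.
public static class BatchInserter
{
    public static int InsertInBatches<TContext, TItem>(
        IEnumerable<TItem> items,
        Func<TContext> createContext,   // e.g. () => new MyDbContext()
        Action<TContext, TItem> add,    // e.g. (db, item) => db.Items.Add(item)
        Action<TContext> save,          // e.g. db => db.SaveChanges()
        int batchSize = 100)
        where TContext : class, IDisposable
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        int batches = 0;
        int pending = 0;
        TContext context = null;
        try
        {
            foreach (var item in items)
            {
                // Create a context lazily, once per segment (never once per item).
                if (context == null) context = createContext();

                add(context, item);
                pending++;

                if (pending == batchSize)
                {
                    save(context);
                    context.Dispose();   // drop all tracked entities
                    context = null;
                    pending = 0;
                    batches++;
                }
            }

            // Save the last, partially filled segment.
            if (pending > 0)
            {
                save(context);
                batches++;
            }
        }
        finally
        {
            context?.Dispose();
        }
        return batches;
    }
}
```

With EF6 the call might look like BatchInserter.InsertInBatches(items, () => new MyDbContext(), (db, item) => db.Items.Add(item), db => db.SaveChanges()), where MyDbContext stands in for your own context class. Each segment is saved separately, so wrap the whole run in a transaction if it has to be all-or-nothing.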

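These tips helped me with bulk inserts in a scenario like the one you describe. There are other possibilities as well, such as stored procedures (SPs) and BulkInsert functions.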
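Another possibility is the one the question itself raises: load the groups up front with Contains, but in chunks, and then look each group up in memory instead of querying once per item. The sketch below is an assumption-laden illustration, not a tested EF recipe. GroupPrefetch and LoadByOwnerChunks are hypothetical names, OwnerId is assumed to be an int, and the actual query is supplied as a delegate. SQL Server does cap a single request at 2100 parameters; how your EF version renders Contains (as parameters or as inlined values) varies, so the chunk size of 2000 is just a cautious default.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not an EF API): fetches groups for many owner ids
// in chunks and indexes them in a dictionary for in-memory lookups.
public static class GroupPrefetch
{
    // Stays below SQL Server's limit of 2100 parameters per request.
    public const int DefaultChunkSize = 2000;

    public static Dictionary<TKey, TGroup> LoadByOwnerChunks<TGroup, TKey>(
        IEnumerable<int> ownerIds,
        Func<IReadOnlyList<int>, IEnumerable<TGroup>> fetchByOwners,
        Func<TGroup, TKey> keyOf,
        int chunkSize = DefaultChunkSize)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));

        // Query each owner id only once.
        var distinct = ownerIds.Distinct().ToList();
        var result = new Dictionary<TKey, TGroup>();

        for (int i = 0; i < distinct.Count; i += chunkSize)
        {
            var chunk = distinct.GetRange(i, Math.Min(chunkSize, distinct.Count - i));

            // One query per chunk instead of one query per item.
            foreach (var group in fetchByOwners(chunk))
            {
                result[keyOf(group)] = group;
            }
        }
        return result;
    }
}
```

In the question's setting, fetchByOwners could be something like ids => db.Groups.Where(g => ids.Contains(g.OwnerId)).ToList(), and keyOf could be g => (g.OwnerId, g.Type, g.TypeId), so that the insert loop does a dictionary lookup on the same three values it currently passes to GetByAsync. Groups created during the loop would need to be added to the dictionary as well.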
