Question

Because of the particular design of my database, I'm getting very poor performance in EF. Here are the relevant relationships:

ERD

I have the following data model:

public class Sensor
{
    [Key]
    public int Id { get; set; }

    [Required, MaxLength(64)]
    public string Name { get; set; }

    [Required, ForeignKey("Type")]
    public int SensorTypeId { get; set; }

    public virtual SensorType Type { get; set; }

    public virtual ICollection<SensorSample> SensorSamples { get; set; }
}

public class SensorSample
{
    [Key]
    public int Id { get; set; }

    [Required, ForeignKey("Sensor")]
    public int SensorId { get; set; }

    public virtual Sensor Sensor { get; set; }

    [Required]
    public DateTime SampleTime { get; set; }

    [Required]
    public virtual ICollection<SampleData> SampleData { get; set; }
}

public class SampleData
{
    [Key]
    public int Id { get; set; }

    [Required, ForeignKey("DataType")]
    public int SampleDataTypeId { get; set; }

    public virtual SampleDataType DataType { get; set; }

    [Required, ForeignKey("Unit")]
    public int SampleUnitId { get; set; }

    public virtual SampleUnit Unit { get; set; }

    [Required, ForeignKey("Sample")]
    public int SensorSampleId { get; set; }

    public virtual SensorSample Sample { get; set; }

    [MaxLength(128)]
    public string Value { get; set; }
}

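Because a SensorSample can hold several types of data samples (e.g. temperature, pressure, etc.), an INSERT has to query for an existing sample so that the data is associated with the correct SampleTime. This is done with the following code: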

SensorSample sample = null;
foreach (var d in input)
{
    SampleData data = new SampleData();
    data.SampleDataTypeId = dataTypeId;
    data.SampleUnitId = unitId;
    data.Value = d.Value;

    // check for existing sample for this sensor and timestamp
    sample = SensorSamples.FirstOrDefault(s => s.SensorId == sensor.Id && s.SampleTime == d.Timestamp);
    if (sample == null)
    {
        // sample doesn't exist, create a new one
        sample = new SensorSample();
        sample.SampleTime = d.Timestamp;
        sample.SensorId = sensor.Id;
        sensor.SensorSamples.Add(sample);
    }
    // add the data to the sample
    sample.SampleData.Add(data);
}

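I tried optimizing the insertion of sample data by working in batches (i.e. 1000 records at a time). That does help, but even with an index on the SampleTime field, the lookup query seems to take longer as more records are added.

So my question is: how can I improve the design and/or performance of adding sample data to the database? Is there a better database structure for handling the one-to-many relationship? I'm willing to make some compromises in the database design if I get a worthwhile performance gain in return, but I still need to be able to handle different kinds of data associated with a given SampleTime.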

Answer 1

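Entity Framework maintains a local cache of all local entities and tracks any changes made to them. As the number of entities grows, these checks become more expensive.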

Here is a very interesting series of posts on how DetectChanges works and what you can do about it. Pay particular attention to part 3.

When I need to bulk load a lot of data, I disable DetectChanges and clear the local cache after saving, so that memory can be freed:

    public static void ClearDbSet<T>(this DbContext context) where T : class {
        var entries = context.ChangeTracker.Entries<T>().Where(e => e.State == EntityState.Unchanged);

        foreach (DbEntityEntry<T> entry in entries.ToList()) {
            entry.State = EntityState.Detached;
        }
    }

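The ToList call is necessary; otherwise the iterator will throw an exception, because setting an entry's state to Detached modifies the collection being enumerated.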
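To show how the pieces described above might fit together, here is a minimal sketch of a bulk insert that disables automatic change detection, saves in batches, and detaches the saved entities after each save. It assumes EF6 (the System.Data.Entity namespace). The class name BulkLoadSketch, the method BulkInsert, and the batchSize parameter are hypothetical names chosen for illustration; the ClearDbSet helper is repeated only so the sketch is self-contained.

```csharp
using System.Collections.Generic;
using System.Data.Entity;
using System.Data.Entity.Infrastructure;
using System.Linq;

public static class BulkLoadSketch
{
    // Same helper as in the answer: detach entities that are already saved.
    public static void ClearDbSet<T>(this DbContext context) where T : class
    {
        var entries = context.ChangeTracker.Entries<T>()
                             .Where(e => e.State == EntityState.Unchanged);

        // ToList takes a snapshot so detaching does not disturb the enumeration.
        foreach (DbEntityEntry<T> entry in entries.ToList())
        {
            entry.State = EntityState.Detached;
        }
    }

    // Hypothetical helper: add items in batches with change detection off.
    public static void BulkInsert<T>(DbContext context, IEnumerable<T> items, int batchSize)
        where T : class
    {
        bool previous = context.Configuration.AutoDetectChangesEnabled;
        context.Configuration.AutoDetectChangesEnabled = false;
        try
        {
            int pending = 0;
            foreach (T item in items)
            {
                context.Set<T>().Add(item);   // marks the entity as Added
                pending++;
                if (pending == batchSize)
                {
                    context.SaveChanges();    // saved entities become Unchanged
                    context.ClearDbSet<T>();  // drop them from the local cache
                    pending = 0;
                }
            }

            if (pending > 0)
            {
                context.SaveChanges();
                context.ClearDbSet<T>();
            }
        }
        finally
        {
            // Restore the caller's setting even if a save fails.
            context.Configuration.AutoDetectChangesEnabled = previous;
        }
    }
}
```

Note that ClearDbSet only detaches entities of type T. If each added entity carries child entities of another type (for example SampleData under a SensorSample), those would need their own ClearDbSet call.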

Answer 2

To maximize LOAD performance for test data:

    DON'T run the project in Debug mode (EF is several times slower there)

Use the following settings:

    Context.Configuration.LazyLoadingEnabled = false;
    Context.Configuration.ProxyCreationEnabled = false;
    Context.Configuration.AutoDetectChangesEnabled = false;
    Context.Configuration.ValidateOnSaveEnabled = false;

Every 100 entries or fewer, discard the context.

 Using( new context)
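The "new context per batch" advice could look roughly like the sketch below. It assumes EF6; the names BatchedContextSketch, InsertInBatches, SaveBatch, and createContext are hypothetical, and the default batch size of 100 simply mirrors the figure mentioned above. The context factory is passed in because the actual context type is not shown in this thread.

```csharp
using System;
using System.Collections.Generic;
using System.Data.Entity;

public static class BatchedContextSketch
{
    // Hypothetical helper: split the input into batches and give each
    // batch its own short-lived context.
    public static void InsertInBatches<TContext, TEntity>(
        Func<TContext> createContext,
        IEnumerable<TEntity> items,
        int batchSize = 100)
        where TContext : DbContext
        where TEntity : class
    {
        var batch = new List<TEntity>(batchSize);
        foreach (TEntity item in items)
        {
            batch.Add(item);
            if (batch.Count == batchSize)
            {
                SaveBatch(createContext, batch);
                batch.Clear();
            }
        }

        // Save whatever is left over after the last full batch.
        if (batch.Count > 0)
        {
            SaveBatch(createContext, batch);
        }
    }

    private static void SaveBatch<TContext, TEntity>(
        Func<TContext> createContext,
        List<TEntity> batch)
        where TContext : DbContext
        where TEntity : class
    {
        // Disposing the context discards its change tracker and local cache.
        using (TContext context = createContext())
        {
            context.Configuration.AutoDetectChangesEnabled = false;
            context.Configuration.ValidateOnSaveEnabled = false;

            foreach (TEntity entity in batch)
            {
                context.Set<TEntity>().Add(entity);
            }

            context.SaveChanges();
        }
    }
}
```

A call might look like `BatchedContextSketch.InsertInBatches(() => new MyContext(), samples);`, where MyContext stands in for whatever DbContext subclass the application defines.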

Try

Context.Set<TPoco>().AddOrUpdate(poco);

instead of

   Context.Set<TPoco>().FirstOrDefault(lambda);
   Context.Set<TPoco>().Add(poco);

Answer 3

EF6 beta 1 has an AddRange function that may suit your purpose:

INSERTing many rows with Entity Framework 6 beta 1

Note that the article I linked to describes the EF5 technique of setting AutoDetectChangesEnabled to false, which is the one @felipe refers to.
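As a rough illustration of the AddRange approach, the sketch below adds a whole collection in one call. It assumes EF6 or later, where DbSet<T>.AddRange is available; the class name AddRangeSketch and the method InsertAll are hypothetical.

```csharp
using System.Collections.Generic;
using System.Data.Entity;

public static class AddRangeSketch
{
    // Hypothetical helper: add every item in one call, then save once.
    public static void InsertAll<T>(DbContext context, IEnumerable<T> items)
        where T : class
    {
        // With AutoDetectChangesEnabled left on, AddRange runs change
        // detection once for the whole collection rather than once per
        // entity, which is what makes it cheaper than repeated Add calls.
        context.Set<T>().AddRange(items);
        context.SaveChanges();
    }
}
```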

Improving the performance of inserting data for a one-to-many relationship in EF
