实体框架查询性能与原始SQL执行的{ext}不同

时间:2016-02-16 23:59:16

标签: c# sql-server entity-framework sql-server-2008 entity-framework-6

我对实体框架查询执行性能有疑问。

模式

我有一个像这样的表结构:

CREATE TABLE [dbo].[DataLogger]
(
    [ID] [bigint] IDENTITY(1,1) NOT NULL,
    [ProjectID] [bigint] NULL,
    CONSTRAINT [PrimaryKey1] PRIMARY KEY CLUSTERED ( [ID] ASC )
)

CREATE TABLE [dbo].[DCDistributionBox]
(
    [ID] [bigint] IDENTITY(1,1) NOT NULL,
    [DataLoggerID] [bigint] NOT NULL,
    CONSTRAINT [PrimaryKey2] PRIMARY KEY CLUSTERED ( [ID] ASC )
)

ALTER TABLE [dbo].[DCDistributionBox]
    ADD CONSTRAINT [FK_DCDistributionBox_DataLogger] 
    FOREIGN KEY([DataLoggerID]) REFERENCES [dbo].[DataLogger] ([ID])

CREATE TABLE [dbo].[DCString] 
(
    [ID] [bigint] IDENTITY(1,1) NOT NULL,
    [DCDistributionBoxID] [bigint] NOT NULL,
    [CurrentMPP] [decimal](18, 2) NULL,
    CONSTRAINT [PrimaryKey3] PRIMARY KEY CLUSTERED ( [ID] ASC )
)

ALTER TABLE [dbo].[DCString]
    ADD CONSTRAINT [FK_DCString_DCDistributionBox] 
    FOREIGN KEY([DCDistributionBoxID]) REFERENCES [dbo].[DCDistributionBox] ([ID])

CREATE TABLE [dbo].[StringData]
(
    [DCStringID] [bigint] NOT NULL,
    [TimeStamp] [datetime] NOT NULL,
    [DCCurrent] [decimal](18, 2) NULL,
    CONSTRAINT [PrimaryKey4] PRIMARY KEY CLUSTERED ( [TimeStamp] DESC, [DCStringID] ASC)
)

CREATE NONCLUSTERED INDEX [TimeStamp_DCCurrent-NonClusteredIndex] 
ON [dbo].[StringData] ([DCStringID] ASC, [TimeStamp] ASC)
INCLUDE ([DCCurrent])

外键上的标准索引也存在(我不想出于空间原因列出所有索引)。

[StringData]表有以下存储统计信息:

  • 数据空间:26,901.86 MB
  • 行数:131,827,749
  • 分区:true
  • 分区数:62

用法

我现在想要对[StringData]表中的数据进行分组并进行一些聚合。

我创建了一个实体框架查询(查询的详细信息可以找到here):

var compareData = model.StringDatas
    .AsNoTracking()
    .Where(p => p.DCString.DCDistributionBox.DataLogger.ProjectID == projectID && p.TimeStamp >= fromDate && p.TimeStamp < tillDate)
    .Select(d => new
    {
        TimeStamp = d.TimeStamp,
        DCCurrentMpp = d.DCCurrent / d.DCString.CurrentMPP
    })
    .GroupBy(d => DbFunctions.AddMinutes(DateTime.MinValue, DbFunctions.DiffMinutes(DateTime.MinValue, d.TimeStamp) / minuteInterval * minuteInterval))
    .Select(d => new
    {
        TimeStamp = d.Key,
        DCCurrentMppMin = d.Min(v => v.DCCurrentMpp),
        DCCurrentMppMax = d.Max(v => v.DCCurrentMpp),
        DCCurrentMppAvg = d.Average(v => v.DCCurrentMpp),
        DCCurrentMppStDev = DbFunctions.StandardDeviationP(d.Select(v => v.DCCurrentMpp))
    })
    .ToList();

执行时间跨度非常长!?

  • 执行结果:92rows
  • 执行时间:~16000ms

尝试

我现在看一下Entity Framework生成的SQL查询,看起来像这样:

DECLARE @p__linq__4 DATETIME = 0;
DECLARE @p__linq__3 DATETIME = 0;
DECLARE @p__linq__5 INT = 15;
DECLARE @p__linq__6 INT = 15;
DECLARE @p__linq__0 BIGINT = 20827;
DECLARE @p__linq__1 DATETIME = '06.02.2016 00:00:00';
DECLARE @p__linq__2 DATETIME = '07.02.2016 00:00:00';

SELECT 
1 AS [C1], 
[GroupBy1].[K1] AS [C2], 
[GroupBy1].[A1] AS [C3], 
[GroupBy1].[A2] AS [C4], 
[GroupBy1].[A3] AS [C5], 
[GroupBy1].[A4] AS [C6]
FROM ( SELECT 
    [Project1].[K1] AS [K1], 
    MIN([Project1].[A1]) AS [A1], 
    MAX([Project1].[A2]) AS [A2], 
    AVG([Project1].[A3]) AS [A3], 
    STDEVP([Project1].[A4]) AS [A4]
    FROM ( SELECT 
        DATEADD (minute, ((DATEDIFF (minute, @p__linq__4, [Project1].[TimeStamp])) / @p__linq__5) * @p__linq__6, @p__linq__3) AS [K1], 
        [Project1].[C1] AS [A1], 
        [Project1].[C1] AS [A2], 
        [Project1].[C1] AS [A3], 
        [Project1].[C1] AS [A4]
        FROM ( SELECT 
            [Extent1].[TimeStamp] AS [TimeStamp], 
            [Extent1].[DCCurrent] / [Extent2].[CurrentMPP] AS [C1]
            FROM    [dbo].[StringData] AS [Extent1]
            INNER JOIN [dbo].[DCString] AS [Extent2] ON [Extent1].[DCStringID] = [Extent2].[ID]
            INNER JOIN [dbo].[DCDistributionBox] AS [Extent3] ON [Extent2].[DCDistributionBoxID] = [Extent3].[ID]
            INNER JOIN [dbo].[DataLogger] AS [Extent4] ON [Extent3].[DataLoggerID] = [Extent4].[ID]
            WHERE (([Extent4].[ProjectID] = @p__linq__0) OR (([Extent4].[ProjectID] IS NULL) AND (@p__linq__0 IS NULL))) AND ([Extent1].[TimeStamp] >= @p__linq__1) AND ([Extent1].[TimeStamp] < @p__linq__2)
        )  AS [Project1]
    )  AS [Project1]
    GROUP BY [K1]
)  AS [GroupBy1]

我将此SQL查询复制到同一台计算机上的SSMS中,并使用与实体框架相同的连接字符串连接。

结果是性能大大提高:

  • 执行结果:92rows
  • 执行时间:517毫秒

我也做了一些循环运行测试,结果很奇怪。测试看起来像这样

for (int i = 0; i < 50; i++)
{
    DateTime begin = DateTime.UtcNow;

    [...query...]

    TimeSpan excecutionTimeSpan = DateTime.UtcNow - begin;
    Debug.WriteLine("{0}th run: {1}", i, excecutionTimeSpan.ToString());
}

结果非常不同,看起来是随机的(?):

0th run: 00:00:11.0618580
1th run: 00:00:11.3339467
2th run: 00:00:10.0000676
3th run: 00:00:10.1508140
4th run: 00:00:09.2041939
5th run: 00:00:07.6710321
6th run: 00:00:10.3386312
7th run: 00:00:17.3422765
8th run: 00:00:13.8620557
9th run: 00:00:14.9041528
10th run: 00:00:12.7772906
11th run: 00:00:17.0170235
12th run: 00:00:14.7773750

问题

为什么Entity Framework查询执行速度如此之慢?生成的行数非常低,原始SQL查询显示非常快的性能。

更新1

我注意它不是MetaContext或模型创建延迟。其他一些查询在之前的同一个Model实例上执行,具有良好的性能。

更新2 (与@ x0007me的答案相关):

感谢提示,但可以通过更改模型设置来消除这一点:

modelContext.Configuration.UseDatabaseNullSemantics = true;

EF生成的SQL现在是:

SELECT 
1 AS [C1], 
[GroupBy1].[K1] AS [C2], 
[GroupBy1].[A1] AS [C3], 
[GroupBy1].[A2] AS [C4], 
[GroupBy1].[A3] AS [C5], 
[GroupBy1].[A4] AS [C6]
FROM ( SELECT 
    [Project1].[K1] AS [K1], 
    MIN([Project1].[A1]) AS [A1], 
    MAX([Project1].[A2]) AS [A2], 
    AVG([Project1].[A3]) AS [A3], 
    STDEVP([Project1].[A4]) AS [A4]
    FROM ( SELECT 
        DATEADD (minute, ((DATEDIFF (minute, @p__linq__4, [Project1].[TimeStamp])) / @p__linq__5) * @p__linq__6, @p__linq__3) AS [K1], 
        [Project1].[C1] AS [A1], 
        [Project1].[C1] AS [A2], 
        [Project1].[C1] AS [A3], 
        [Project1].[C1] AS [A4]
        FROM ( SELECT 
            [Extent1].[TimeStamp] AS [TimeStamp], 
            [Extent1].[DCCurrent] / [Extent2].[CurrentMPP] AS [C1]
            FROM    [dbo].[StringData] AS [Extent1]
            INNER JOIN [dbo].[DCString] AS [Extent2] ON [Extent1].[DCStringID] = [Extent2].[ID]
            INNER JOIN [dbo].[DCDistributionBox] AS [Extent3] ON [Extent2].[DCDistributionBoxID] = [Extent3].[ID]
            INNER JOIN [dbo].[DataLogger] AS [Extent4] ON [Extent3].[DataLoggerID] = [Extent4].[ID]
            WHERE ([Extent4].[ProjectID] = @p__linq__0) AND ([Extent1].[TimeStamp] >= @p__linq__1) AND ([Extent1].[TimeStamp] < @p__linq__2)
        )  AS [Project1]
    )  AS [Project1]
    GROUP BY [K1]
)  AS [GroupBy1]

因此,您可以看到您所描述的问题现已解决,但执行时间不会改变。

此外,正如您在架构和原始执行时间中所看到的,我使用了具有高度优化索引器的优化结构。

更新3 (与@Vladimir Baranov的答案有关):

我不明白为什么这可能与查询计划缓存有关。因为在MSDN中清楚地描述了EF6使用查询计划缓存。

一个简单的测试证明,巨大的执行时间differenz与查询计划缓存(伪代码)无关:

using(var modelContext = new ModelContext())
{
    modelContext.Query(); //1th run activates caching

    modelContext.Query(); //2th used cached plan
}

结果,两个查询都以相同的执行时间运行。

更新4 (与@bubi的答案相关):

我尝试在您描述它时运行EF生成的查询:

int result = model.Database.ExecuteSqlCommand(@"SELECT 
    1 AS [C1], 
    [GroupBy1].[K1] AS [C2], 
    [GroupBy1].[A1] AS [C3], 
    [GroupBy1].[A2] AS [C4], 
    [GroupBy1].[A3] AS [C5], 
    [GroupBy1].[A4] AS [C6]
    FROM ( SELECT 
        [Project1].[K1] AS [K1], 
        MIN([Project1].[A1]) AS [A1], 
        MAX([Project1].[A2]) AS [A2], 
        AVG([Project1].[A3]) AS [A3], 
        STDEVP([Project1].[A4]) AS [A4]
        FROM ( SELECT 
            DATEADD (minute, ((DATEDIFF (minute, 0, [Project1].[TimeStamp])) / @p__linq__5) * @p__linq__6, 0) AS [K1], 
            [Project1].[C1] AS [A1], 
            [Project1].[C1] AS [A2], 
            [Project1].[C1] AS [A3], 
            [Project1].[C1] AS [A4]
            FROM ( SELECT 
                [Extent1].[TimeStamp] AS [TimeStamp], 
                [Extent1].[DCCurrent] / [Extent2].[CurrentMPP] AS [C1]
                FROM    [dbo].[StringData] AS [Extent1]
                INNER JOIN [dbo].[DCString] AS [Extent2] ON [Extent1].[DCStringID] = [Extent2].[ID]
                INNER JOIN [dbo].[DCDistributionBox] AS [Extent3] ON [Extent2].[DCDistributionBoxID] = [Extent3].[ID]
                INNER JOIN [dbo].[DataLogger] AS [Extent4] ON [Extent3].[DataLoggerID] = [Extent4].[ID]
                WHERE ([Extent4].[ProjectID] = @p__linq__0) AND ([Extent1].[TimeStamp] >= @p__linq__1) AND ([Extent1].[TimeStamp] < @p__linq__2)
            )  AS [Project1]
        )  AS [Project1]
        GROUP BY [K1]
    )  AS [GroupBy1]",
    new SqlParameter("p__linq__0", 20827),
    new SqlParameter("p__linq__1", fromDate),
    new SqlParameter("p__linq__2", tillDate),
    new SqlParameter("p__linq__5", 15),
    new SqlParameter("p__linq__6", 15));
  • 执行结果:92
  • 执行时间:~16000ms

只要正常的EF查询就花了精确的时间!?

更新5 (与@vittore的回答有关):

我创建了一个跟踪调用树,也许它会有所帮助:

call tree trace

更新6 (与@usr的答案相关):

我通过SQL Server Profiler创建了两个showplan XML。

Fast run (SSMS).SQLPlan

Slow run (EF).SQLPlan

更新7 (与@VladimirBaranov的评论相关):

我现在运行一些与您的评论相关的测试用例。

首先,我通过使用新的计算列和匹配的INDEXER来完成订单操作的时间。这减少了与DATEADD(MINUTE, DATEDIFF(MINUTE, 0, [TimeStamp] ) / 15* 15, 0)相关的性能滞后。详细了解如何以及为何找到here

结果看起来像这样:

Pure EntityFramework查询:

for (int i = 0; i < 3; i++)
{
    DateTime begin = DateTime.UtcNow;
    var result = model.StringDatas
        .AsNoTracking()
        .Where(p => p.DCString.DCDistributionBox.DataLogger.ProjectID == projectID && p.TimeStamp15Minutes >= fromDate && p.TimeStamp15Minutes < tillDate)
        .Select(d => new
        {
            TimeStamp = d.TimeStamp15Minutes,
            DCCurrentMpp = d.DCCurrent / d.DCString.CurrentMPP
        })
        .GroupBy(d => d.TimeStamp)
        .Select(d => new
        {
            TimeStamp = d.Key,
            DCCurrentMppMin = d.Min(v => v.DCCurrentMpp),
            DCCurrentMppMax = d.Max(v => v.DCCurrentMpp),
            DCCurrentMppAvg = d.Average(v => v.DCCurrentMpp),
            DCCurrentMppStDev = DbFunctions.StandardDeviationP(d.Select(v => v.DCCurrentMpp))
        })
        .ToList();

        TimeSpan excecutionTimeSpan = DateTime.UtcNow - begin;
        Debug.WriteLine("{0}th run pure EF: {1}", i, excecutionTimeSpan.ToString());
}

第0次运行纯EF: 00:00:12.6460624

第1次运行纯EF: 00:00:11.0258393

第2次运行纯EF: 00:00:08.4171044

我现在使用EF生成的SQL作为SQL查询:

for (int i = 0; i < 3; i++)
{
    DateTime begin = DateTime.UtcNow;
    int result = model.Database.ExecuteSqlCommand(@"SELECT 
        1 AS [C1], 
        [GroupBy1].[K1] AS [TimeStamp15Minutes], 
        [GroupBy1].[A1] AS [C2], 
        [GroupBy1].[A2] AS [C3], 
        [GroupBy1].[A3] AS [C4], 
        [GroupBy1].[A4] AS [C5]
        FROM ( SELECT 
            [Project1].[TimeStamp15Minutes] AS [K1], 
            MIN([Project1].[C1]) AS [A1], 
            MAX([Project1].[C1]) AS [A2], 
            AVG([Project1].[C1]) AS [A3], 
            STDEVP([Project1].[C1]) AS [A4]
            FROM ( SELECT 
                [Extent1].[TimeStamp15Minutes] AS [TimeStamp15Minutes], 
                [Extent1].[DCCurrent] / [Extent2].[CurrentMPP] AS [C1]
                FROM    [dbo].[StringData] AS [Extent1]
                INNER JOIN [dbo].[DCString] AS [Extent2] ON [Extent1].[DCStringID] = [Extent2].[ID]
                INNER JOIN [dbo].[DCDistributionBox] AS [Extent3] ON [Extent2].[DCDistributionBoxID] = [Extent3].[ID]
                INNER JOIN [dbo].[DataLogger] AS [Extent4] ON [Extent3].[DataLoggerID] = [Extent4].[ID]
                WHERE ([Extent4].[ProjectID] = @p__linq__0) AND ([Extent1].[TimeStamp15Minutes] >= @p__linq__1) AND ([Extent1].[TimeStamp15Minutes] < @p__linq__2)
            )  AS [Project1]
            GROUP BY [Project1].[TimeStamp15Minutes]
        )  AS [GroupBy1];",
    new SqlParameter("p__linq__0", 20827),
    new SqlParameter("p__linq__1", fromDate),
    new SqlParameter("p__linq__2", tillDate));

    TimeSpan excecutionTimeSpan = DateTime.UtcNow - begin;
    Debug.WriteLine("{0}th run: {1}", i, excecutionTimeSpan.ToString());
}

第0次运行: 00:00:00.8381200

第1次运行: 00:00:00.6920736

第2次运行: 00:00:00.7081006

OPTION(RECOMPILE)

for (int i = 0; i < 3; i++)
{
    DateTime begin = DateTime.UtcNow;
    int result = model.Database.ExecuteSqlCommand(@"SELECT 
        1 AS [C1], 
        [GroupBy1].[K1] AS [TimeStamp15Minutes], 
        [GroupBy1].[A1] AS [C2], 
        [GroupBy1].[A2] AS [C3], 
        [GroupBy1].[A3] AS [C4], 
        [GroupBy1].[A4] AS [C5]
        FROM ( SELECT 
            [Project1].[TimeStamp15Minutes] AS [K1], 
            MIN([Project1].[C1]) AS [A1], 
            MAX([Project1].[C1]) AS [A2], 
            AVG([Project1].[C1]) AS [A3], 
            STDEVP([Project1].[C1]) AS [A4]
            FROM ( SELECT 
                [Extent1].[TimeStamp15Minutes] AS [TimeStamp15Minutes], 
                [Extent1].[DCCurrent] / [Extent2].[CurrentMPP] AS [C1]
                FROM    [dbo].[StringData] AS [Extent1]
                INNER JOIN [dbo].[DCString] AS [Extent2] ON [Extent1].[DCStringID] = [Extent2].[ID]
                INNER JOIN [dbo].[DCDistributionBox] AS [Extent3] ON [Extent2].[DCDistributionBoxID] = [Extent3].[ID]
                INNER JOIN [dbo].[DataLogger] AS [Extent4] ON [Extent3].[DataLoggerID] = [Extent4].[ID]
                WHERE ([Extent4].[ProjectID] = @p__linq__0) AND ([Extent1].[TimeStamp15Minutes] >= @p__linq__1) AND ([Extent1].[TimeStamp15Minutes] < @p__linq__2)
            )  AS [Project1]
            GROUP BY [Project1].[TimeStamp15Minutes]
        )  AS [GroupBy1]
        OPTION(RECOMPILE);",
    new SqlParameter("p__linq__0", 20827),
    new SqlParameter("p__linq__1", fromDate),
    new SqlParameter("p__linq__2", tillDate));

    TimeSpan excecutionTimeSpan = DateTime.UtcNow - begin;
    Debug.WriteLine("{0}th run: {1}", i, excecutionTimeSpan.ToString());
}

使用RECOMPILE进行第0次运行: 00:00:00.8260932

使用RECOMPILE进行第1次运行: 00:00:00.9139730

使用RECOMPILE进行第2次运行: 00:00:01.0680665

SSMS中的相同SQL查询(没有RECOMPILE):

00:00:01.105

在SSMS中使用相同的SQL查询(使用RECOMPILE):

00:00:00.902

我希望这些都是你需要的价值。

4 个答案:

答案 0 :(得分:12)

在这个答案中,我专注于原始观察:EF生成的查询速度很慢,但是当在SSMS中运行相同的查询时,它很快。

此行为的一种可能解释是Parameter sniffing

  

SQL Server在执行时使用称为参数嗅探的进程   具有参数的存储过程。当。。。的时候   程序被编译或重新编译,传递给的值   参数被评估并用于创建执行计划。那   然后将值与执行计划一起存储在计划缓存中。上   后续执行,使用相同的值 - 和相同的计划 -

因此,EF会生成一个参数很少的查询。第一次运行此查询时,服务器使用在第一次运行中生效的参数值为此查询创建执行计划。那个计划通常都很好。但是,稍后您使用其他参数值运行相同的EF查询。对于新的参数值,先前生成的计划可能不是最优的,并且查询变慢。服务器继续使用以前的计划,因为它仍然是相同的查询,只是参数值不同。

如果此时您获取查询文本并尝试直接在SSMS中运行它,服务器将创建一个新的执行计划,因为从技术上讲,它与EF应用程序发出的查询不同。即使一个字符差异就足够了,会话设置的任何更改也足以让服务器将查询视为新查询。因此,服务器在其缓存中有两个看似相同的查询计划。第一个“慢”计划对于参数的新值来说很慢,因为它最初是为不同的参数值构建的。第二个“快速”计划是针对当前参数值构建的,因此速度很快。

Erland Sommarskog的文章Slow in the Application, Fast in SSMS更详细地解释了这个和其他相关领域。

有几种方法可以放弃缓存的计划并强制服务器重新生成它们。更改表或更改表索引应该这样做 - 它应该丢弃与此表相关的所有计划,包括“慢”和“快”。然后在EF应用程序中使用新的参数值运行查询并获得新的“快速”计划。您在SSMS中运行查询并获得具有新参数值的第二个“快速”计划。服务器仍然会生成两个计划,但现在两个计划都很快。

另一种变体是向查询添加OPTION(RECOMPILE)。使用此选项,服务器不会将生成的计划存储在其缓存中。因此,每次查询运行时,服务器都会使用实际参数值来生成(它认为)对于给定参数值最佳的计划。缺点是计划生成的额外开销。

请注意,例如,由于过时的统计信息,服务器仍然可以选择使用此选项的“不良”计划。但是,至少参数嗅探不会成为问题。

答案 1 :(得分:2)

我知道我在这里有点迟,但由于我参与了相关查询的构建,我觉得有必要采取一些行动。

我在Linq to Entities查询中看到的一般问题是,我们构建它们的典型方式会引入不必要的参数,这可能会影响缓存的数据库查询计划(所谓的 Sql Server参数嗅探问题)。

让我们按表达式

查看您的查询组
d => DbFunctions.AddMinutes(DateTime.MinValue, DbFunctions.DiffMinutes(DateTime.MinValue, d.TimeStamp) / minuteInterval * minuteInterval)

由于minuteInterval是变量(即非常数),因此它引入了一个参数。对于DateTime.MinValue也是如此(请注意,基本类型会显示与常量相似的内容,但对于DateTimedecimal等,它们是静态只读字段,这在表达式中对待它们的方式有很大差异)。

但无论它在CLR系统中如何表示,DateTime.MinValue在逻辑上都是常量。那么minuteInterval呢,这取决于你的使用情况。

我解决问题的尝试是消除与该表达式相关的所有参数。由于我们无法使用编译器生成的表达式执行此操作,因此我们需要使用System.Linq.Expressions手动构建它。后者不直观,但幸运的是我们可以使用混合方法。

首先,我们需要一个帮助方法,它允许我们替换表达式参数:

public static class ExpressionUtils
{
    public static Expression ReplaceParemeter(this Expression expression, ParameterExpression source, Expression target)
    {
        return new ParameterReplacer { Source = source, Target = target }.Visit(expression);
    }

    class ParameterReplacer : ExpressionVisitor
    {
        public ParameterExpression Source;
        public Expression Target;
        protected override Expression VisitParameter(ParameterExpression node)
        {
            return node == Source ? Target : base.VisitParameter(node);
        }
    }
}

现在我们拥有了所需的一切。让逻辑封装在自定义方法中:

public static class QueryableUtils
{
    public static IQueryable<IGrouping<DateTime, T>> GroupBy<T>(this IQueryable<T> source, Expression<Func<T, DateTime>> dateSelector, int minuteInterval)
    {
        Expression<Func<DateTime, DateTime, int, DateTime>> expr = (date, baseDate, interval) =>
            DbFunctions.AddMinutes(baseDate, DbFunctions.DiffMinutes(baseDate, date) / interval).Value;
        var selector = Expression.Lambda<Func<T, DateTime>>(
            expr.Body
            .ReplaceParemeter(expr.Parameters[0], dateSelector.Body)
            .ReplaceParemeter(expr.Parameters[1], Expression.Constant(DateTime.MinValue))
            .ReplaceParemeter(expr.Parameters[2], Expression.Constant(minuteInterval))
            , dateSelector.Parameters[0]
        );
        return source.GroupBy(selector);
    }
}

最后,替换

.GroupBy(d => DbFunctions.AddMinutes(DateTime.MinValue, DbFunctions.DiffMinutes(DateTime.MinValue, d.TimeStamp) / minuteInterval * minuteInterval))

.GroupBy(d => d.TimeStamp, minuteInterval * minuteInterval)

并且生成的SQL查询将是这样的(对于minuteInterval = 15):

SELECT 
    1 AS [C1], 
    [GroupBy1].[K1] AS [C2], 
    [GroupBy1].[A1] AS [C3], 
    [GroupBy1].[A2] AS [C4], 
    [GroupBy1].[A3] AS [C5], 
    [GroupBy1].[A4] AS [C6]
    FROM ( SELECT 
        [Project1].[K1] AS [K1], 
        MIN([Project1].[A1]) AS [A1], 
        MAX([Project1].[A2]) AS [A2], 
        AVG([Project1].[A3]) AS [A3], 
        STDEVP([Project1].[A4]) AS [A4]
        FROM ( SELECT 
            DATEADD (minute, (DATEDIFF (minute, convert(datetime2, '0001-01-01 00:00:00.0000000', 121), [Project1].[TimeStamp])) / 225, convert(datetime2, '0001-01-01 00:00:00.0000000', 121)) AS [K1], 
            [Project1].[C1] AS [A1], 
            [Project1].[C1] AS [A2], 
            [Project1].[C1] AS [A3], 
            [Project1].[C1] AS [A4]
            FROM ( SELECT 
                [Extent1].[TimeStamp] AS [TimeStamp], 
                [Extent1].[DCCurrent] / [Extent2].[CurrentMPP] AS [C1]
                FROM    [dbo].[StringDatas] AS [Extent1]
                INNER JOIN [dbo].[DCStrings] AS [Extent2] ON [Extent1].[DCStringID] = [Extent2].[ID]
                INNER JOIN [dbo].[DCDistributionBoxes] AS [Extent3] ON [Extent2].[DCDistributionBoxID] = [Extent3].[ID]
                INNER JOIN [dbo].[DataLoggers] AS [Extent4] ON [Extent3].[DataLoggerID] = [Extent4].[ID]
                WHERE ([Extent4].[ProjectID] = @p__linq__0) AND ([Extent1].[TimeStamp] >= @p__linq__1) AND ([Extent1].[TimeStamp] < @p__linq__2)
            )  AS [Project1]
        )  AS [Project1]
        GROUP BY [K1]
    )  AS [GroupBy1]

正如您所看到的,我们成功地删除了一些查询参数。这会有帮助吗?好吧,与任何数据库查询调优一样,它可能会也可能不会。你需要试试看。

答案 2 :(得分:1)

数据库引擎根据每个查询的调用方式确定每个查询的计划。在您的EF Linq查询的情况下,计划的准备方式是每个输入参数都被视为未知(因为您不知道会有什么进入)。在您的实际查询中,您将所有参数作为查询的一部分,因此它将在与参数化计划不同的计划下运行。我立刻看到的一件受影响的作品是

  

...(@ p__linq__0 IS NULL)..

这是假的,因为p_linq_0 = 20827并且是非NULL,因此WHERE的前半部分开始时为FALSE,不再需要查看。在LINQ查询的情况下,数据库不知道会发生什么,所以无论如何都要评估所有内容。

您需要查看是否可以使用索引或其他技术来加快运行速度。

答案 3 :(得分:1)

当EF运行查询时,它将其包装并使用sp_executesql运行它,这意味着执行计划将缓存在存储过程执行计划缓存中。由于原始sql语句与SP版本如何构建其执行计划的差异(参数嗅探等),两者可能不同。

当运行EF(sp包装)版本时,SQL服务器很可能使用更通用的执行计划,该计划涵盖的范围更广,而不是实际传入的值。

那就是说,为了减少SQL服务器尝试某些东西的机会&#34;搞笑&#34;使用散列连接等,我要做的第一件事是:

1)索引where子句和连接中使用的列

create index ix_DataLogger_ProjectID on DataLogger (ProjectID);
create index ix_DCDistributionBox_DataLoggerID on DCDistributionBox (DataLoggerID);
create index ix_DCString_DCDistributionBoxID on DCString (DCDistributionBoxID);

2)在Linq查询中进行显式连接以消除或ProductID为空部分