Poor performance when using parameters in LINQ to Entities

Time: 2014-07-01 11:13:47

Tags: linq-to-entities

I have a problem with a very slow query (several minutes!) generated by Linq2Entities. The Linq query looks like this:

        var i = new IncludeStrategy<DeviceData>();
        if (includeIOData)
        {
            i.Include<DeviceDataExtended>(dd => dd.Extended);
        }

        var data = from dd in GetAllData()
                   where dd.DeviceId == deviceId
                   select dd;

        data = data.Include(i);

        data = data.OrderByDescending(dd => dd.UtcTime);

        if (start.HasValue)
        {
            var utcStart = start.Value.ToUniversalTime();
            data = data.Where(dd => dd.UtcTime >= utcStart);
        }
        if (end.HasValue)
        {
            var utcEnd = end.Value.ToUniversalTime();
            data = data.Where(dd => dd.UtcTime <= utcEnd);
        }

        return data.Skip(0).Take(15);

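To test the query, I took the generated SQL output and ran it directly in SQL Server Management Studio, and lo and behold, it took only a few seconds. The SQL output generated by Linq looks like this: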

declare @p__linq__0 int = 15386
declare @p__linq__1 datetime2(7) = convert(datetime2, '2014-05-01 00:00:00.0000000', 121)
declare @p__linq__2 datetime2(7) = convert(datetime2, '2014-06-01 00:00:00.0000000', 121)

SELECT TOP (15) [Project1].[C1] AS [C1], [Project1].[Id] AS [Id], [Project1].[DeviceId] AS [DeviceId], [Project1].[PersonId] AS [PersonId], [Project1].[VehicleId] AS [VehicleId], [Project1].[TokenId] AS [TokenId], [Project1].[Latitude] AS [Latitude], [Project1].[Longitude] AS [Longitude], [Project1].[UtcTime] AS [UtcTime], [Project1].[Speed] AS [Speed], [Project1].[Heading] AS [Heading], [Project1].[Satellites] AS [Satellites], [Project1].[IgnitionState] AS [IgnitionState], [Project1].[UserInput] AS [UserInput], [Project1].[CreateTimeUtc] AS [CreateTimeUtc], [Project1].[IOData] AS [IOData]
FROM ( SELECT [Project1].[Id] AS [Id], [Project1].[DeviceId] AS [DeviceId], [Project1].[PersonId] AS [PersonId], [Project1].[VehicleId] AS [VehicleId], [Project1].[TokenId] AS [TokenId], [Project1].[Latitude] AS [Latitude], [Project1].[Longitude] AS [Longitude], [Project1].[UtcTime] AS [UtcTime], [Project1].[Speed] AS [Speed], [Project1].[Heading] AS [Heading], [Project1].[Satellites] AS [Satellites], [Project1].[IOData] AS [IOData], [Project1].[IgnitionState] AS [IgnitionState], [Project1].[UserInput] AS [UserInput], [Project1].[CreateTimeUtc] AS [CreateTimeUtc], [Project1].[C1] AS [C1], row_number() OVER (ORDER BY [Project1].[UtcTime] DESC) AS [row_number]
FROM ( SELECT [Extent1].[Id] AS [Id], [Extent1].[DeviceId] AS [DeviceId], [Extent1].[PersonId] AS [PersonId], [Extent1].[VehicleId] AS [VehicleId], [Extent1].[TokenId] AS [TokenId], [Extent1].[Latitude] AS [Latitude], [Extent1].[Longitude] AS [Longitude], [Extent1].[UtcTime] AS [UtcTime], [Extent1].[Speed] AS [Speed], [Extent1].[Heading] AS [Heading], [Extent1].[Satellites] AS [Satellites], [Extent1].[IOData] AS [IOData], [Extent1].[IgnitionState] AS [IgnitionState], [Extent1].[UserInput] AS [UserInput], [Extent1].[CreateTimeUtc] AS [CreateTimeUtc], 1 AS [C1]FROM [dbo].[DeviceData] AS [Extent1]
        WHERE ([Extent1].[DeviceId] = @p__linq__0) AND ([Extent1].[UtcTime] >= @p__linq__1) AND ([Extent1].[UtcTime] <= @p__linq__2)
    )  AS [Project1]
)  AS [Project1]
WHERE [Project1].[row_number] > 0
ORDER BY [Project1].[UtcTime] DESC

There are three parameters: @p__linq__0, @p__linq__1 and @p__linq__2, representing deviceId, utcStart and utcEnd respectively.

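After many attempts at modifying the Linq query, I eventually found that it must come down to the Linq parameters: if I write exactly the same query in Linq but replace the parameters with actual values, it works! That also explains why the query was fast in SQL Server Management Studio, since I had naturally replaced the Linq parameters with actual values there.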

IQueryable<DeviceData> temp = from dd in ddRep.GetAll(i)
                              where dd.DeviceId == 15386 && dd.UtcTime >= new DateTime(2014, 05, 01) && dd.UtcTime <= new DateTime(2014, 06, 01)
                              orderby dd.UtcTime descending
                              select dd;

return temp.Skip(0).Take(15);

The generated SQL output now looks like this:

SELECT TOP (15) [Project1].[C1] AS [C1], [Project1].[Id] AS [Id], [Project1].[DeviceId] AS [DeviceId], [Project1].[PersonId] AS [PersonId], [Project1].[VehicleId] AS [VehicleId], [Project1].[TokenId] AS [TokenId], [Project1].[Latitude] AS [Latitude], [Project1].[Longitude] AS [Longitude], [Project1].[UtcTime] AS [UtcTime], [Project1].[Speed] AS [Speed], [Project1].[Heading] AS [Heading], [Project1].[Satellites] AS [Satellites], [Project1].[IgnitionState] AS [IgnitionState], [Project1].[UserInput] AS [UserInput], [Project1].[CreateTimeUtc] AS [CreateTimeUtc], [Project1].[IOData] AS [IOData]
FROM ( SELECT [Project1].[Id] AS [Id], [Project1].[DeviceId] AS [DeviceId], [Project1].[PersonId] AS [PersonId], [Project1].[VehicleId] AS [VehicleId], [Project1].[TokenId] AS [TokenId], [Project1].[Latitude] AS [Latitude], [Project1].[Longitude] AS [Longitude], [Project1].[UtcTime] AS [UtcTime], [Project1].[Speed] AS [Speed], [Project1].[Heading] AS [Heading], [Project1].[Satellites] AS [Satellites], [Project1].[IOData] AS [IOData], [Project1].[IgnitionState] AS [IgnitionState], [Project1].[UserInput] AS [UserInput], [Project1].[CreateTimeUtc] AS [CreateTimeUtc], [Project1].[C1] AS [C1], row_number() OVER (ORDER BY [Project1].[UtcTime] DESC) AS [row_number]
FROM ( SELECT [Extent1].[Id] AS [Id], [Extent1].[DeviceId] AS [DeviceId], [Extent1].[PersonId] AS [PersonId], [Extent1].[VehicleId] AS [VehicleId], [Extent1].[TokenId] AS [TokenId], [Extent1].[Latitude] AS [Latitude], [Extent1].[Longitude] AS [Longitude], [Extent1].[UtcTime] AS [UtcTime], [Extent1].[Speed] AS [Speed], [Extent1].[Heading] AS [Heading], [Extent1].[Satellites] AS [Satellites], [Extent1].[IOData] AS [IOData], [Extent1].[IgnitionState] AS [IgnitionState], [Extent1].[UserInput] AS [UserInput], [Extent1].[CreateTimeUtc] AS [CreateTimeUtc], 1 AS [C1]FROM [dbo].[DeviceData] AS [Extent1] 
        WHERE (15386 = [Extent1].[DeviceId]) AND ([Extent1].[UtcTime] >= convert(datetime2, '2014-05-01 00:00:00.0000000', 121)) AND ([Extent1].[UtcTime] <= convert(datetime2, '2014-06-01 00:00:00.0000000', 121))
    )  AS [Project1]
)  AS [Project1]
WHERE [Project1].[row_number] > 0
ORDER BY [Project1].[UtcTime] DESC

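The only difference between the two queries is the WHERE clause.

So what is going on here? Why is using parameters in the Linq query so expensive, and is there any workaround (while still sticking with EF)?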

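Edit
After some more investigation, I found that it comes down to whether the utcEnd parameter is a nullable DateTime or not. If it is nullable, the query yields a CAST and performance is good; utcStart makes no difference. Luckily, casting it to a nullable DateTime is easy enough, but I still wonder why this affects performance. The query hits an index on DeviceId (asc), UtcTime (asc).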
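For reference, a minimal sketch of the nullable cast described above. The `DeviceData` class here is a cut-down stand-in with only the two filtered columns, and `NullableEndFilter.ApplyWindow` is a hypothetical helper name, not part of my real code. It compiles against any `IQueryable<DeviceData>`; the only change from the original query is that `utcEnd` is declared as `DateTime?`, so the comparison is lifted to a nullable one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the real entity: only the columns used in the filter.
public class DeviceData
{
    public int DeviceId { get; set; }
    public DateTime UtcTime { get; set; }
}

public static class NullableEndFilter
{
    // Applies the optional time window to the query.
    // The end bound is captured as DateTime? instead of DateTime.
    public static IQueryable<DeviceData> ApplyWindow(
        IQueryable<DeviceData> data, DateTime? start, DateTime? end)
    {
        if (start.HasValue)
        {
            // Unchanged: the start bound stays a plain DateTime.
            DateTime utcStart = start.Value.ToUniversalTime();
            data = data.Where(dd => dd.UtcTime >= utcStart);
        }
        if (end.HasValue)
        {
            // Nullable local: dd.UtcTime is converted to DateTime? for the comparison.
            DateTime? utcEnd = end.Value.ToUniversalTime();
            data = data.Where(dd => dd.UtcTime <= utcEnd);
        }
        return data;
    }
}
```

Run against an in-memory list via `AsQueryable()`, this only checks that the filter returns the same rows as before; it says nothing about the SQL or the execution plan, which I observed only through EF.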

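Edit 2:
After looking at the actual execution plans, I can see that the cause of the problem is definitely related to the fact that the table is partitioned. The bad query does a clustered index seek on, and concatenates, every table in the view (more than 200 tables!), whereas the good query only seeks and concatenates the tables that cover the period between UtcStart and UtcEnd.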

Bad query

Good query

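So why does a parameterized DateTime lead to this horrible execution plan? I now realize that this is really a SQL Server problem rather than an EF problem, so maybe I should move it to dba.stackexchange.com?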

0 answers:

No answers