I'm having a problem with a very slow (several minutes!) query generated by LINQ to Entities. The LINQ query looks like this:
var i = new IncludeStrategy<DeviceData>();
if (includeIOData)
{
    i.Include<DeviceDataExtended>(dd => dd.Extended);
}

var data = from dd in GetAllData()
           where dd.DeviceId == deviceId
           select dd;
data = data.Include(i);
data = data.OrderByDescending(dd => dd.UtcTime);

if (start.HasValue)
{
    var utcStart = start.Value.ToUniversalTime();
    data = data.Where(dd => dd.UtcTime >= utcStart);
}
if (end.HasValue)
{
    var utcEnd = end.Value.ToUniversalTime();
    data = data.Where(dd => dd.UtcTime <= utcEnd);
}

return data.Skip(0).Take(15);
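To test the query, I inspected the generated SQL and ran it directly in SQL Server Management Studio, and lo and behold, it takes only a few seconds. The SQL generated by LINQ looks like this: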
declare @p__linq__0 int = 15386
declare @p__linq__1 datetime2(7) = convert(datetime2, '2014-05-01 00:00:00.0000000', 121)
declare @p__linq__2 datetime2(7) = convert(datetime2, '2014-06-01 00:00:00.0000000', 121)
SELECT TOP (15) [Project1].[C1] AS [C1], [Project1].[Id] AS [Id], [Project1].[DeviceId] AS [DeviceId], [Project1].[PersonId] AS [PersonId], [Project1].[VehicleId] AS [VehicleId], [Project1].[TokenId] AS [TokenId], [Project1].[Latitude] AS [Latitude], [Project1].[Longitude] AS [Longitude], [Project1].[UtcTime] AS [UtcTime], [Project1].[Speed] AS [Speed], [Project1].[Heading] AS [Heading], [Project1].[Satellites] AS [Satellites], [Project1].[IgnitionState] AS [IgnitionState], [Project1].[UserInput] AS [UserInput], [Project1].[CreateTimeUtc] AS [CreateTimeUtc], [Project1].[IOData] AS [IOData]
FROM ( SELECT [Project1].[Id] AS [Id], [Project1].[DeviceId] AS [DeviceId], [Project1].[PersonId] AS [PersonId], [Project1].[VehicleId] AS [VehicleId], [Project1].[TokenId] AS [TokenId], [Project1].[Latitude] AS [Latitude], [Project1].[Longitude] AS [Longitude], [Project1].[UtcTime] AS [UtcTime], [Project1].[Speed] AS [Speed], [Project1].[Heading] AS [Heading], [Project1].[Satellites] AS [Satellites], [Project1].[IOData] AS [IOData], [Project1].[IgnitionState] AS [IgnitionState], [Project1].[UserInput] AS [UserInput], [Project1].[CreateTimeUtc] AS [CreateTimeUtc], [Project1].[C1] AS [C1], row_number() OVER (ORDER BY [Project1].[UtcTime] DESC) AS [row_number]
FROM ( SELECT [Extent1].[Id] AS [Id], [Extent1].[DeviceId] AS [DeviceId], [Extent1].[PersonId] AS [PersonId], [Extent1].[VehicleId] AS [VehicleId], [Extent1].[TokenId] AS [TokenId], [Extent1].[Latitude] AS [Latitude], [Extent1].[Longitude] AS [Longitude], [Extent1].[UtcTime] AS [UtcTime], [Extent1].[Speed] AS [Speed], [Extent1].[Heading] AS [Heading], [Extent1].[Satellites] AS [Satellites], [Extent1].[IOData] AS [IOData], [Extent1].[IgnitionState] AS [IgnitionState], [Extent1].[UserInput] AS [UserInput], [Extent1].[CreateTimeUtc] AS [CreateTimeUtc], 1 AS [C1]
FROM [dbo].[DeviceData] AS [Extent1]
WHERE ([Extent1].[DeviceId] = @p__linq__0) AND ([Extent1].[UtcTime] >= @p__linq__1) AND ([Extent1].[UtcTime] <= @p__linq__2)
) AS [Project1]
) AS [Project1]
WHERE [Project1].[row_number] > 0
ORDER BY [Project1].[UtcTime] DESC
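There are three parameters: @p__linq__0, @p__linq__1 and @p__linq__2, representing deviceId, utcStart and utcEnd respectively.
After many attempts at modifying the LINQ query, I eventually found that it must come down to the LINQ parameters, because if I write exactly the same query in LINQ but replace the parameters with actual values, it works! This also explains why the query was fast in SQL Server Management Studio, since I had naturally replaced the LINQ parameters with actual values there.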
IQueryable<DeviceData> temp = from dd in ddRep.GetAll(i)
                              where dd.DeviceId == 15386
                                    && dd.UtcTime >= new DateTime(2014, 05, 01)
                                    && dd.UtcTime <= new DateTime(2014, 06, 01)
                              orderby dd.UtcTime descending
                              select dd;
return temp.Skip(0).Take(15);
The generated SQL output now looks like this:
SELECT TOP (15) [Project1].[C1] AS [C1], [Project1].[Id] AS [Id], [Project1].[DeviceId] AS [DeviceId], [Project1].[PersonId] AS [PersonId], [Project1].[VehicleId] AS [VehicleId], [Project1].[TokenId] AS [TokenId], [Project1].[Latitude] AS [Latitude], [Project1].[Longitude] AS [Longitude], [Project1].[UtcTime] AS [UtcTime], [Project1].[Speed] AS [Speed], [Project1].[Heading] AS [Heading], [Project1].[Satellites] AS [Satellites], [Project1].[IgnitionState] AS [IgnitionState], [Project1].[UserInput] AS [UserInput], [Project1].[CreateTimeUtc] AS [CreateTimeUtc], [Project1].[IOData] AS [IOData]
FROM ( SELECT [Project1].[Id] AS [Id], [Project1].[DeviceId] AS [DeviceId], [Project1].[PersonId] AS [PersonId], [Project1].[VehicleId] AS [VehicleId], [Project1].[TokenId] AS [TokenId], [Project1].[Latitude] AS [Latitude], [Project1].[Longitude] AS [Longitude], [Project1].[UtcTime] AS [UtcTime], [Project1].[Speed] AS [Speed], [Project1].[Heading] AS [Heading], [Project1].[Satellites] AS [Satellites], [Project1].[IOData] AS [IOData], [Project1].[IgnitionState] AS [IgnitionState], [Project1].[UserInput] AS [UserInput], [Project1].[CreateTimeUtc] AS [CreateTimeUtc], [Project1].[C1] AS [C1], row_number() OVER (ORDER BY [Project1].[UtcTime] DESC) AS [row_number]
FROM ( SELECT [Extent1].[Id] AS [Id], [Extent1].[DeviceId] AS [DeviceId], [Extent1].[PersonId] AS [PersonId], [Extent1].[VehicleId] AS [VehicleId], [Extent1].[TokenId] AS [TokenId], [Extent1].[Latitude] AS [Latitude], [Extent1].[Longitude] AS [Longitude], [Extent1].[UtcTime] AS [UtcTime], [Extent1].[Speed] AS [Speed], [Extent1].[Heading] AS [Heading], [Extent1].[Satellites] AS [Satellites], [Extent1].[IOData] AS [IOData], [Extent1].[IgnitionState] AS [IgnitionState], [Extent1].[UserInput] AS [UserInput], [Extent1].[CreateTimeUtc] AS [CreateTimeUtc], 1 AS [C1]
FROM [dbo].[DeviceData] AS [Extent1]
WHERE (15386 = [Extent1].[DeviceId]) AND ([Extent1].[UtcTime] >= convert(datetime2, '2014-05-01 00:00:00.0000000', 121)) AND ([Extent1].[UtcTime] <= convert(datetime2, '2014-06-01 00:00:00.0000000', 121))
) AS [Project1]
) AS [Project1]
WHERE [Project1].[row_number] > 0
ORDER BY [Project1].[UtcTime] DESC
The only difference between the two queries is the WHERE clause.
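As a side note on why the two WHERE clauses differ at all: a lambda that captures a local variable (such as deviceId) puts a member-access node into the expression tree, and LINQ to Entities sends such values as SQL parameters, whereas a literal becomes a constant node that is inlined into the SQL text. The sketch below only inspects the expression trees and does not touch a database; the DeviceData class is a hypothetical, cut-down stand-in for the real entity, and RightOperandKind is a helper name made up for this illustration.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical, cut-down stand-in for the real entity (assumption).
public class DeviceData
{
    public int DeviceId { get; set; }
}

public static class ExpressionShapeDemo
{
    // Returns the node type of the right-hand operand of a comparison
    // such as "dd.DeviceId == x".
    public static ExpressionType RightOperandKind(
        Expression<Func<DeviceData, bool>> predicate)
    {
        var comparison = (BinaryExpression)predicate.Body;
        return comparison.Right.NodeType;
    }

    public static void Main()
    {
        int deviceId = 15386;

        // Captured local variable: a field access on the compiler-generated closure.
        Console.WriteLine(RightOperandKind(dd => dd.DeviceId == deviceId)); // MemberAccess

        // Literal value: a constant node.
        Console.WriteLine(RightOperandKind(dd => dd.DeviceId == 15386));    // Constant
    }
}
```

This only shows where the parameter/literal split originates on the C# side; it says nothing about why SQL Server plans the two forms differently.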
So what is going on here? Why is using parameters in a LINQ query so expensive, and is there any workaround (while still sticking with EF)?
Edit:
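After further investigation, I found that it comes down to whether the utcEnd parameter is a nullable DateTime. If it is nullable, the query produces a CAST and performs well; utcStart has no effect. Fortunately, casting it to a nullable DateTime is very easy, but I would still like to know why this affects performance. The query hits an index on DeviceId (asc), UtcTime (asc).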
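For reference, the workaround amounts to something like the sketch below: the only change from the original query is that utcEnd is declared as DateTime? instead of being inferred as DateTime. To keep it runnable without a database, it filters an in-memory list through AsQueryable; the DeviceData class is a hypothetical, cut-down stand-in, the Latest method name is made up for this example, and the ordering is applied at the end rather than before the filters. Whether the CAST appears in the generated SQL is something I have only observed against my own EF setup.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, cut-down stand-in for the real entity (assumption).
public class DeviceData
{
    public int DeviceId { get; set; }
    public DateTime UtcTime { get; set; }
}

public static class NullableEndDemo
{
    public static IQueryable<DeviceData> Latest(
        IQueryable<DeviceData> data, int deviceId, DateTime? start, DateTime? end)
    {
        data = data.Where(dd => dd.DeviceId == deviceId);

        if (start.HasValue)
        {
            var utcStart = start.Value.ToUniversalTime();
            data = data.Where(dd => dd.UtcTime >= utcStart);
        }
        if (end.HasValue)
        {
            // The only change: the captured variable is DateTime?, not DateTime.
            DateTime? utcEnd = end.Value.ToUniversalTime();
            data = data.Where(dd => dd.UtcTime <= utcEnd);
        }

        return data.OrderByDescending(dd => dd.UtcTime).Skip(0).Take(15);
    }

    public static void Main()
    {
        // In-memory sample rows instead of a real EF context.
        var rows = new List<DeviceData>
        {
            new DeviceData { DeviceId = 15386, UtcTime = new DateTime(2014, 5, 10, 0, 0, 0, DateTimeKind.Utc) },
            new DeviceData { DeviceId = 15386, UtcTime = new DateTime(2014, 7, 1, 0, 0, 0, DateTimeKind.Utc) },
        }.AsQueryable();

        var start = new DateTime(2014, 5, 1, 0, 0, 0, DateTimeKind.Utc);
        var end = new DateTime(2014, 6, 1, 0, 0, 0, DateTimeKind.Utc);

        // Only the May row falls inside the range.
        Console.WriteLine(Latest(rows, 15386, start, end).Count()); // 1
    }
}
```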
Edit 2:
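After looking at the actual execution plan, I can see that the cause is definitely related to the fact that the table is partitioned. The bad query performs a clustered index seek on every table in the view (more than 200 tables!) and concatenates the results, whereas the good query only seeks and concatenates the tables covering the period between utcStart and utcEnd.
So why does a parameterized datetime lead to this terrible execution plan? I now realize that this really is a SQL Server problem rather than an EF problem, so maybe I should move it to dba.stackexchange.com?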