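I am developing an API with .NET Core and Entity Framework Core that communicates with an Azure SQL database.
I found a performance problem with an EF query that involves a left join. Surprisingly, it turned out to be faster to convert the results of two separate queries to lists and join them in C# than to populate the objects from the result of a single query. I think I have worked out what is happening, but I would like to know why.
Original code: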
Stopwatch sw = new Stopwatch();
sw.Start();
string[] values = new[] { val1, val2, val3 };
var qry1 = _logic.GetAllTable1Data();
qry1 = qry1.Include(a => a.Table2);
qry1 = qry1.Include(a => a.Table3);
qry1 = qry1.Where(a => values.Contains(a.Table2.Column));
var select = qry1.Select(a => new { a.Column1, a.Table3.Column2 });
var qry2 = _logic.GetAllTable4Data();
var join = select.GroupJoin(qry2, a => a.Column1, a => a.Column1,
    (a, b) => new { Table1 = a, Table2 = b }).ToList();
sw.Stop();
return sw.Elapsed.TotalSeconds;
Generated SQL:
SELECT
[a].[Column1] AS [Column10],
...
All 80+ columns from Table1
...
All Columns from Table4
...
[a.Table3].[Column2]
FROM [Standard_Case] AS [a]
LEFT JOIN [Table3] AS [a.Table3] ON [a].[Column1] = [a.Table3].[Column1]
INNER JOIN [Table2] AS [a.Table2] ON [a].[Column] = [a.Table2].[Column]
LEFT JOIN [Table4] AS [a0] ON [a].[Column1] = [a0].[Column1]
WHERE [a.Table2].[Column] IN (val1, val2, val3)
ORDER BY [Column10]
(execution time 488 ms)
Time taken:
2.9992557 s
Modified code:
Stopwatch sw = new Stopwatch();
sw.Start();
string[] values = new[] { val1, val2, val3 };
var qry1 = _logic.GetAllTable1Data();
qry1 = qry1.Include(a => a.Table2);
qry1 = qry1.Include(a => a.Table3);
qry1 = qry1.Where(a => values.Contains(a.Table2.Column));
var select = qry1.Select(a => new { a.Column1, a.Table3.Column2 }).ToList();
var qry2 = _logic.GetAllTable4Data().ToList();
var join = select.GroupJoin(qry2, a => a.Column1, a => a.Column1,
    (a, b) => new { Table1 = a, Table2 = b }).ToList();
sw.Stop();
return sw.Elapsed.TotalSeconds;
Generated SQL:
SELECT
[a].[Column1],
[a.Table3].[Column2]
FROM [Table1] AS [a]
LEFT JOIN [Table3] AS [a.Table3] ON [a].[Column1] = [a.Table3].[Column1]
INNER JOIN [Table2] AS [a.Table2] ON [a].[Column] = [a.Table2].[Column]
WHERE [a.Table2].[Column] IN (val1, val2, val3)
(execution time 8 ms)
SELECT
...
All Columns from Table4
...
FROM [Table4] AS [s]
(execution time 7 ms)
Time taken:
0.7477659 s
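For reference, the in-memory join that the modified code ends up doing can be sketched with plain LINQ to Objects. This is only a minimal, self-contained illustration: the tuple lists `table1Rows` and `table4Rows` and their values are hypothetical stand-ins for the two materialized query results, not my real data, and no EF Core call is involved.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the projected Table1/Table3 rows after .ToList().
var table1Rows = new List<(int Column1, string Column2)>
{
    (1, "a"),
    (2, "b"),
    (3, "c"),
};

// Hypothetical stand-in for the Table4 rows after .ToList().
var table4Rows = new List<(int Column1, string Payload)>
{
    (1, "x"),
    (1, "y"),
    (3, "z"),
};

// GroupJoin on in-memory lists keeps every outer row and pairs it with
// a (possibly empty) group of matching inner rows, like a left join.
var joined = table1Rows
    .GroupJoin(
        table4Rows,
        outer => outer.Column1,
        inner => inner.Column1,
        (outer, matches) => new { Table1 = outer, Table4 = matches.ToList() })
    .ToList();

foreach (var row in joined)
{
    Console.WriteLine($"{row.Table1.Column1}: {row.Table4.Count} matching Table4 row(s)");
}
```

Because both inputs are already lists here, the join runs entirely in C# and the database only sees the two narrow SELECT statements shown above.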
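I think the original code takes longer because it selects many columns from the database (more than 80) that I neither requested nor need: every column in the query's base table. This means far more data is pulled back from the query.
Does anyone know why my Select operator is not being respected, as it is when I run the queries separately, or have any other theory as to why my first piece of code is so slow? An ORDER BY clause has also been inserted into the first query. Is there a reason for that?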