Question

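I have a strange problem with LINQ to SQL, and I really have tried searching for an answer. I'm designing a SQL database and have just recently tried to retrieve an object from it.

The problem is with multiple joins. All my tables use identity columns as primary keys.

The database design is as follows:

MasterTable: Id (primary key, identity column, int), MasterColumn1 (nvarchar(50))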

Slave1: Id (primary key, identity column, int), MasterId (int, foreign key -> MasterTable.Id), SlaveCol1

Slave2: Id (primary key, identity column, int), MasterId (int, foreign key -> MasterTable.Id), SlaveColumn2

The code used:

var db = new TestDbDataContext() { Log = Console.Out };
var res = from f in db.MasterTables
          where f.MasterColumn1 == "wtf"
          select new
                     {
                         f.Id, 
                         SlaveCols1 = f.Slave1s.Select(s => s.SlaveCol1),
                         SlaveCols2 = f.Slave2s.Select(s => s.SlaveColumn2)
                     };
foreach (var re in res)
{
    Console.Out.WriteLine(
        re.Id + " "
      + string.Join(", ", re.SlaveCols1.ToArray()) + " "
      + string.Join(", ", re.SlaveCols2.ToArray())
    );
}

The log is:

SELECT [t0].[Id], [t1].[SlaveCol1], (
   SELECT COUNT(*)
   FROM [FR].[Slave1] AS [t2]
   WHERE [t2].[MasterId] = [t0].[Id]
   ) AS [value]
FROM [FR].[MasterTable] AS [t0]
LEFT OUTER JOIN [FR].[Slave1] AS [t1] ON [t1].[MasterId] = [t0].[Id]
WHERE [t0].[MasterColumn1] = @p0
ORDER BY [t0].[Id], [t1].[Id]
-- @p0: Input NVarChar (Size = 3; Prec = 0; Scale = 0) [wtf]
-- Context: SqlProvider(Sql2008) Model: AttributedMetaModel Build: 3.5.30729.5420
SELECT [t0].[SlaveColumn2]
   FROM [FR].[Slave2] AS [t0]
   WHERE [t0].[MasterId] = @x1
-- @x1: Input Int (Size = 0; Prec = 0; Scale = 0) [1]
-- Context: SqlProvider(Sql2008) Model: AttributedMetaModel Build: 3.5.30729.5420
1 SlaveCol1Wtf SlaveCol2Wtf

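Why, oh why, does it not do two outer joins? I really care about this, because I have a much bigger database with many tables referencing the same table (all in one-to-many relationships), and 20 SELECT round trips to the database server is not optimal!

As a side note, I can produce the wanted result by using explicit outer joins like this: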

var db = new TestDbDataContext() { Log = Console.Out };
var res = from f in db.MasterTables
          join s1 in db.Slave1s on f.Id equals s1.MasterId into s1Tbl
          from s1 in s1Tbl.DefaultIfEmpty()
          join s2 in db.Slave2s on f.Id equals s2.MasterId into s2Tbl
          from s2 in s2Tbl.DefaultIfEmpty()
          where f.MasterColumn1 == "wtf"
          select new { f.Id, s1.SlaveCol1, s2.SlaveColumn2 };
foreach (var re in res)
{
    Console.Out.WriteLine(re.Id + " " + re.SlaveCol1 + " " + re.SlaveColumn2);
}

But I want to use the associations that LINQ to SQL provides rather than manual joins! How?

----------- edit -----------------

I have also tried prefetching like this:

using (new DbConnectionScope())
{
    var db = new TestDbDataContext() { Log = Console.Out };
    DataLoadOptions loadOptions = new DataLoadOptions();
    loadOptions.LoadWith<MasterTable>(c => c.Slave1s);
    loadOptions.LoadWith<MasterTable>(c => c.Slave2s);
    db.LoadOptions = loadOptions;

    var res = from f in db.MasterTables
              where f.MasterColumn1 == "wtf"
              select f;
    foreach (var re in res)
    {
        Console.Out.WriteLine(re.Id + " " + 
            string.Join(", ", re.Slave1s.Select(s => s.SlaveCol1).ToArray()) + " " + 
            string.Join(", ", re.Slave2s.Select(s => s.SlaveColumn2).ToArray()));
    }
}

Same result =(

Answer 1

As for the "why": LINQ to SQL probably thinks it is making your query better by avoiding multiple outer joins.

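Suppose you pull 20 entries from the master table, and each slave table has 20 entries for every entry in the master table. With the double outer join you would pull 8,000 rows in a single round trip, as opposed to two round trips of 400 rows each. At some point, two round trips become cheaper. That may not be true in this particular case, but if you join many tables this way and pull a lot of data from each of them, the balance can tip quite easily.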
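The arithmetic above can be checked with a small in-memory sketch. It uses LINQ to Objects rather than LINQ to SQL, and the collections `masterIds`, `slave1` and `slave2` are hypothetical stand-ins for the three tables, filled with made-up values.

```csharp
using System;
using System.Linq;

class RowCountSketch
{
    static void Main()
    {
        // Sizes taken from the example in the answer: 20 masters, 20 slaves each.
        const int masterCount = 20;
        const int slavesPerMaster = 20;

        var masterIds = Enumerable.Range(1, masterCount).ToList();

        // Hypothetical in-memory stand-ins for the Slave1 and Slave2 tables.
        var slave1 = masterIds
            .SelectMany(id => Enumerable.Range(1, slavesPerMaster)
                .Select(n => new { MasterId = id, SlaveCol1 = "s1-" + n }))
            .ToList();
        var slave2 = masterIds
            .SelectMany(id => Enumerable.Range(1, slavesPerMaster)
                .Select(n => new { MasterId = id, SlaveColumn2 = "s2-" + n }))
            .ToList();

        // Joining both child collections in one flattened result pairs every
        // Slave1 row of a master with every Slave2 row of the same master.
        // (An inner join gives the same count as an outer join here, because
        // every master has child rows.)
        int flattenedRows =
            (from id in masterIds
             join s1 in slave1 on id equals s1.MasterId
             join s2 in slave2 on id equals s2.MasterId
             select id).Count();

        Console.WriteLine("Double join, one result set: " + flattenedRows + " rows");   // 8000
        Console.WriteLine("Separate result sets: " + slave1.Count + " + " + slave2.Count + " rows"); // 400 + 400
    }
}
```

The flattened result grows with the product of the child counts, while the separate result sets grow with their sum, which is the trade-off the answer describes.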

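You may also want to look into the possibility that LINQ to SQL could use multiple result sets to run both SELECT statements in a single round trip. In that case, the two-statement approach would likely be much faster than the double outer join.

Update

After a bit more testing, it has become apparent that Jim Wooley's answer is closer to the mark: apparently LINQ to SQL simply decides not to eagerly load any property other than the first one you specify. The behavior is also odd because it isn't entirely lazy either: it loads each remaining property in a separate round trip as part of the initial evaluation of the query. To me, this seems like a pretty significant limitation of LINQ to SQL.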

Answer 2

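You're on the right track with using LoadOptions for prefetching and traversing the associations rather than writing explicit joins. However, because you are trying to do multiple 1-M navigations from MasterTable, you would effectively be creating a Cartesian product between the Slave1 and Slave2 records. As a result, LINQ to SQL ignores your load options and lazy loads the child records for each master row.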

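You can optimize this slightly by removing the second child load option. The generated query will then issue a single request that returns your MasterTable and Slave1s, but still lazy loads each of the Slave2s. You should see the same thing if you do the following: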

var res = from f in db.MasterTables
          where f.MasterColumn1 == "wtf"
          select new 
          {
             f.Id,
             Cols1 = f.Slave1s.Select(s => s.SlaveCol1).ToArray(),
             Cols2 = f.Slave2s.Select(s => s.SlaveColumn2).ToArray()
          };

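You should see a left join between MasterTables and Slave1s, followed by lazy loading of the Slave2s. This avoids a Cartesian product between Slave1 and Slave2 in the flattened result that comes back from SQL.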
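To make the flattening problem concrete, here is a minimal in-memory sketch using LINQ to Objects, not LINQ to SQL. The arrays and their values are hypothetical: one master row (Id 1) with two Slave1 values and two Slave2 values.

```csharp
using System;
using System.Linq;

class FlattenSketch
{
    static void Main()
    {
        // Hypothetical child rows that all belong to the master row with Id 1.
        var slave1 = new[]
        {
            new { MasterId = 1, SlaveCol1 = "a" },
            new { MasterId = 1, SlaveCol1 = "b" }
        };
        var slave2 = new[]
        {
            new { MasterId = 1, SlaveColumn2 = "x" },
            new { MasterId = 1, SlaveColumn2 = "y" }
        };

        // Putting both child collections into one flat result pairs every
        // Slave1 row with every Slave2 row that shares the same master.
        var rows = from s1 in slave1
                   join s2 in slave2 on s1.MasterId equals s2.MasterId
                   select new { s1.MasterId, s1.SlaveCol1, s2.SlaveColumn2 };

        foreach (var r in rows)
        {
            Console.WriteLine(r.MasterId + " " + r.SlaveCol1 + " " + r.SlaveColumn2);
        }
        // Prints four rows for four child values:
        // 1 a x
        // 1 a y
        // 1 b x
        // 1 b y
    }
}
```

Each child value shows up twice in the flat result, so whoever consumes it has to de-duplicate the rows to rebuild the two collections.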

Answer 3

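Most likely, it runs the initial query, performs the projection after that first query, and then fires off the next set of queries.

I think you need some preloading of the joined tables.

See the following links: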

LINQ: Prefetching data from a second table
http://www.west-wind.com/weblog/posts/2009/Oct/12/LINQ-to-SQL-Lazy-Loading-and-Prefetching (under the heading "DataLoadOptions for Prefetching")
http://www.davidhayden.com/blog/dave/archive/2007/08/05/LINQToSQLLazyLoadingPropertiesSpecifyingPreFetchWhenNeededPerformance.aspx

Answer 4

Try:

var res = from f in db.MasterTables
          where f.MasterColumn1 == "wtf"
          let s1 = f.Slave1s.Select(s => s.SlaveCol1)
          let s2 = f.Slave2s.Select(s => s.SlaveColumn2)
          select new {
                         f.Id, 
                         SlaveCols1 = s1,
                         SlaveCols2 = s2
                     };

Linq-to-sql not producing multiple outer joins?
