在Linq到实体中使用COUNT生成有效的LEFT JOIN

时间:2016-07-16 11:16:45

标签: sql linq linq-to-entities

我有这个架构:

Lists ( ListId, Name, DateCreated, ... )
ListItems( ListId, Text, Foo, Baz, Qux, ... )

我有一个IQueryable<List>代表另一个Linq查询,它返回一些List个实体。

我希望JOIN使用ListItems的一些汇总数据,但这证明很难,因为Linq生成效率低下的SQL - 但我也希望使查询可以组合。

这与我希望Linq生成的SQL类似:

SELECT
    *
FROM
(
    -- This part represents the IQueryable:

    SELECT
        ListId,
        Name,
        ...
    FROM
        Lists
    ORDER BY
        Lists.DateCreated
    OFFSET
        0 ROWS FETCH NEXT 25 ROWS ONLY -- Linq's .Skip(0).Take(25)
) AS ListsResult
LEFT JOIN
(
    -- This is the aggregate data query I want Linq to generate:
    SELECT
        ListId,
        COUNT(1) AS [Count],
        COUNT( CASE WHEN Foo = 'bar' THEN 1 ELSE NULL END ) AS CountFooBar,
        COUNT( CASE WHEN Baz > 5     THEN 1 ELSE NULL END ) AS CountBaz5
    FROM
        ListItems
    WHERE
        Qux IS NOT NULL
    GROUP BY
        ListId
) AS ItemsStats ON ListResults.ListId = ItemsStats.ListId

这是我的Linq - 我更喜欢扩展方法语法:

IQueryable lists = GetLists( 0, 25 );

var stats = this.dbContext.ListItems
    .Where( (ListItem li) => li.Qux != null )
    .GroupBy( (ListItem li) => li.ListId )
    .Select( grp => new
    {
        grp.Key,
        Count       = grp.Count(),
        CountFooBar = grp.Count( (ListItem li) => li.Foo == "bar" )
        CountBaz5   = grp.Count( (ListItem li) => li.Baz  > 5 )
    } )

return lists
    .Join(
        inner: stats,
        outerKeySelector: (List l) => l.ListId,
        innerKeySelector: grp => grp.Key,
        resultSelector: (list, itemStats) => new { list, itemsStats }
    )

然而,这会生成如下所示的SQL(此查询显示我的真实表和列名称,这比我之前发布的模式稍微复杂一点:)

SELECT 
    [Project13].[C2] AS [C1], 
    [Project13].[ListId] AS [ListId], 
    [Project13].[C1] AS [C2], 
    [Project13].[C3] AS [C3], 
    [Project13].[C4] AS [C4], 
    [Project13].[C5] AS [C5], 
    [Project13].[C6] AS [C6], 
    [Project13].[C7] AS [C7]
    FROM ( SELECT 
        [Project11].[C1] AS [C1], 
        [Project11].[ListId] AS [ListId], 
        [Project11].[C2] AS [C2], 
        [Project11].[C3] AS [C3], 
        [Project11].[C4] AS [C4], 
        [Project11].[C5] AS [C5], 
        [Project11].[C6] AS [C6], 
        (SELECT 
            COUNT(1) AS [A1]
            FROM   (SELECT [Project12].[ListId] AS [ListId]
                FROM ( SELECT 
                    [Extent11].[ListId] AS [ListId], 
                    [Extent11].[Created] AS [Created]
                    FROM [dbo].[Lists] AS [Extent11]
                    WHERE ([Extent11].[TenantId] = 8) AND ([Extent11].[BlarghId] = 8)
                )  AS [Project12]
                ORDER BY [Project12].[Created] DESC
                OFFSET 0 ROWS FETCH NEXT 25 ROWS ONLY  ) AS [Limit6]
            INNER JOIN [dbo].[ListItems] AS [Extent12] ON ([Limit6].[TenantId] = [Extent12].[TenantId]) AND ([Limit6].[BlarghId] = [Extent12].[BlarghId]) AND ([Limit6].[ListId] = [Extent12].[ListId])
            WHERE (([Extent12].[Baz] > 0) OR ((LEN([Extent12].[Notes])) > 0) OR ((LEN([Extent12].[Value])) > 0)) AND ([Project11].[TenantId] = [Extent12].[TenantId]) AND ([Project11].[BlarghId] = [Extent12].[BlarghId]) AND ([Project11].[ListId] = [Extent12].[ListId]) AND ([Extent12].[RecommendationRevision] IS NOT NULL)) AS [C7]
        FROM ( SELECT 
            [Project9].[C1] AS [C1], 
            [Project9].[TenantId] AS [TenantId], 
            [Project9].[BlarghId] AS [BlarghId], 
            [Project9].[ListId] AS [ListId], 
            [Project9].[C2] AS [C2], 
            [Project9].[C3] AS [C3], 
            [Project9].[C4] AS [C4], 
            [Project9].[C5] AS [C5], 
            (SELECT 
                COUNT(1) AS [A1]
                FROM   (SELECT [Project10].[TenantId] AS [TenantId], [Project10].[BlarghId] AS [BlarghId], [Project10].[ListId] AS [ListId]
                    FROM ( SELECT 
                        [Extent9].[TenantId] AS [TenantId], 
                        [Extent9].[BlarghId] AS [BlarghId], 
                        [Extent9].[ListId] AS [ListId], 
                        [Extent9].[Created] AS [Created]
                        FROM [dbo].[Lists] AS [Extent9]
                        WHERE ([Extent9].[TenantId] = 8) AND ([Extent9].[BlarghId] = 8)
                    )  AS [Project10]
                    ORDER BY [Project10].[Created] DESC
                    OFFSET 0 ROWS FETCH NEXT 25 ROWS ONLY  ) AS [Limit5]
                INNER JOIN [dbo].[ListItems] AS [Extent10] ON ([Limit5].[TenantId] = [Extent10].[TenantId]) AND ([Limit5].[BlarghId] = [Extent10].[BlarghId]) AND ([Limit5].[ListId] = [Extent10].[ListId])
                WHERE (([Extent10].[Baz] > 0) OR ((LEN([Extent10].[Notes])) > 0) OR ((LEN([Extent10].[Value])) > 0)) AND ([Project9].[TenantId] = [Extent10].[TenantId]) AND ([Project9].[BlarghId] = [Extent10].[BlarghId]) AND ([Project9].[ListId] = [Extent10].[ListId]) AND (3 = [Extent10].[Baz])) AS [C6]
            FROM ( SELECT 
                [Project7].[C1] AS [C1], 
                [Project7].[TenantId] AS [TenantId], 
                [Project7].[BlarghId] AS [BlarghId], 
                [Project7].[ListId] AS [ListId], 
                [Project7].[C2] AS [C2], 
                [Project7].[C3] AS [C3], 
                [Project7].[C4] AS [C4], 
                (SELECT 
                    COUNT(1) AS [A1]
                    FROM   (SELECT [Project8].[TenantId] AS [TenantId], [Project8].[BlarghId] AS [BlarghId], [Project8].[ListId] AS [ListId]
                        FROM ( SELECT 
                            [Extent7].[TenantId] AS [TenantId], 
                            [Extent7].[BlarghId] AS [BlarghId], 
                            [Extent7].[ListId] AS [ListId], 
                            [Extent7].[Created] AS [Created]
                            FROM [dbo].[Lists] AS [Extent7]
                            WHERE ([Extent7].[TenantId] = 8) AND ([Extent7].[BlarghId] = 8)
                        )  AS [Project8]
                        ORDER BY [Project8].[Created] DESC
                        OFFSET 0 ROWS FETCH NEXT 25 ROWS ONLY  ) AS [Limit4]
                    INNER JOIN [dbo].[ListItems] AS [Extent8] ON ([Limit4].[TenantId] = [Extent8].[TenantId]) AND ([Limit4].[BlarghId] = [Extent8].[BlarghId]) AND ([Limit4].[ListId] = [Extent8].[ListId])
                    WHERE (([Extent8].[Baz] > 0) OR ((LEN([Extent8].[Notes])) > 0) OR ((LEN([Extent8].[Value])) > 0)) AND ([Project7].[TenantId] = [Extent8].[TenantId]) AND ([Project7].[BlarghId] = [Extent8].[BlarghId]) AND ([Project7].[ListId] = [Extent8].[ListId]) AND (2 = [Extent8].[Baz])) AS [C5]
                FROM ( SELECT 
                    [Project5].[C1] AS [C1], 
                    [Project5].[TenantId] AS [TenantId], 
                    [Project5].[BlarghId] AS [BlarghId], 
                    [Project5].[ListId] AS [ListId], 
                    [Project5].[C2] AS [C2], 
                    [Project5].[C3] AS [C3], 
                    (SELECT 
                        COUNT(1) AS [A1]
                        FROM   (SELECT [Project6].[TenantId] AS [TenantId], [Project6].[BlarghId] AS [BlarghId], [Project6].[ListId] AS [ListId]
                            FROM ( SELECT 
                                [Extent5].[TenantId] AS [TenantId], 
                                [Extent5].[BlarghId] AS [BlarghId], 
                                [Extent5].[ListId] AS [ListId], 
                                [Extent5].[Created] AS [Created]
                                FROM [dbo].[Lists] AS [Extent5]
                                WHERE ([Extent5].[TenantId] = 8) AND ([Extent5].[BlarghId] = 8)
                            )  AS [Project6]
                            ORDER BY [Project6].[Created] DESC
                            OFFSET 0 ROWS FETCH NEXT 25 ROWS ONLY  ) AS [Limit3]
                        INNER JOIN [dbo].[ListItems] AS [Extent6] ON ([Limit3].[TenantId] = [Extent6].[TenantId]) AND ([Limit3].[BlarghId] = [Extent6].[BlarghId]) AND ([Limit3].[ListId] = [Extent6].[ListId])
                        WHERE (([Extent6].[Baz] > 0) OR ((LEN([Extent6].[Notes])) > 0) OR ((LEN([Extent6].[Value])) > 0)) AND ([Project5].[TenantId] = [Extent6].[TenantId]) AND ([Project5].[BlarghId] = [Extent6].[BlarghId]) AND ([Project5].[ListId] = [Extent6].[ListId]) AND (1 = [Extent6].[Baz])) AS [C4]
                    FROM ( SELECT 
                        [Project3].[C1] AS [C1], 
                        [Project3].[TenantId] AS [TenantId], 
                        [Project3].[BlarghId] AS [BlarghId], 
                        [Project3].[ListId] AS [ListId], 
                        [Project3].[C2] AS [C2], 
                        (SELECT 
                            COUNT(1) AS [A1]
                            FROM   (SELECT [Project4].[TenantId] AS [TenantId], [Project4].[BlarghId] AS [BlarghId], [Project4].[ListId] AS [ListId]
                                FROM ( SELECT 
                                    [Extent3].[TenantId] AS [TenantId], 
                                    [Extent3].[BlarghId] AS [BlarghId], 
                                    [Extent3].[ListId] AS [ListId], 
                                    [Extent3].[Created] AS [Created]
                                    FROM [dbo].[Lists] AS [Extent3]
                                    WHERE ([Extent3].[TenantId] = 8) AND ([Extent3].[BlarghId] = 8)
                                )  AS [Project4]
                                ORDER BY [Project4].[Created] DESC
                                OFFSET 0 ROWS FETCH NEXT 25 ROWS ONLY  ) AS [Limit2]
                            INNER JOIN [dbo].[ListItems] AS [Extent4] ON ([Limit2].[TenantId] = [Extent4].[TenantId]) AND ([Limit2].[BlarghId] = [Extent4].[BlarghId]) AND ([Limit2].[ListId] = [Extent4].[ListId])
                            WHERE (([Extent4].[Baz] > 0) OR ((LEN([Extent4].[Notes])) > 0) OR ((LEN([Extent4].[Value])) > 0)) AND ([Project3].[TenantId] = [Extent4].[TenantId]) AND ([Project3].[BlarghId] = [Extent4].[BlarghId]) AND ([Project3].[ListId] = [Extent4].[ListId]) AND ([Extent4].[Foo] = 1)) AS [C3]
                        FROM ( SELECT 
                            [GroupBy1].[A1] AS [C1], 
                            [GroupBy1].[K1] AS [TenantId], 
                            [GroupBy1].[K2] AS [BlarghId], 
                            [GroupBy1].[K3] AS [ListId], 
                            [GroupBy1].[K4] AS [C2]
                            FROM ( SELECT 
                                [Project2].[K1] AS [K1], 
                                [Project2].[K2] AS [K2], 
                                [Project2].[K3] AS [K3], 
                                [Project2].[K4] AS [K4], 
                                COUNT([Project2].[A1]) AS [A1]
                                FROM ( SELECT 
                                    [Project2].[TenantId] AS [K1], 
                                    [Project2].[BlarghId] AS [K2], 
                                    [Project2].[ListId] AS [K3], 
                                    1 AS [K4], 
                                    1 AS [A1]
                                    FROM ( SELECT 
                                        [Extent2].[ListId] AS [ListId]
                                        FROM   (SELECT [Project1].[ListId] AS [ListId]
                                            FROM ( SELECT 
                                                [Extent1].[ListId] AS [ListId], 
                                                [Extent1].[Created] AS [Created]
                                                FROM [dbo].[Lists] AS [Extent1]
                                                WHERE ([Extent1].[TenantId] = 8) AND ([Extent1].[BlarghId] = 8)
                                            )  AS [Project1]
                                            ORDER BY [Project1].[Created] DESC
                                            OFFSET 0 ROWS FETCH NEXT 25 ROWS ONLY  ) AS [Limit1]
                                        INNER JOIN [dbo].[ListItems] AS [Extent2] ON (([Limit1].[ListId] = [Extent2].[ListId])
                                        WHERE ([Extent2].[Baz] > 0) OR ((LEN([Extent2].[Notes])) > 0) OR ((LEN([Extent2].[Value])) > 0)
                                    )  AS [Project2]
                                )  AS [Project2]
                                GROUP BY [K1], [K2], [K3], [K4]
                            )  AS [GroupBy1]
                        )  AS [Project3]
                    )  AS [Project5]
                )  AS [Project7]
            )  AS [Project9]
        )  AS [Project11]
    )  AS [Project13]

它根本不会将COUNT()语句组合在一起,而是将COUNT谓词移到单独的WHERE子句中。还要注意重复的分页子查询(使用OFFSET 0 ROW FETCH NEXT 25),而我的手写查询只执行一次。

2 个答案:

答案 0 :(得分:1)

这是我的半解决方法:

我意识到最好的短期解决方案是在数据库中使用SQL(作为UDF FUNCTIONVIEW),这意味着某些数据代码因此必须在数据库(而不是将数据库用作“哑商店”)。

我首先创建了一个表值UDF,它接受一个表值参数,推理允许合成,代价是需要生成输入参数表(ListId值的数组来自分页查询)。但是,在这样做的过程中,我意识到Linq-to-Entities(版本6)还不支持Function Imports中的表值参数。

然后我推断一种更好的方法是将COUNT操作移动到VIEW(代表我手写查询中LEFT JOIN的一半),并且然后我可以让Linq加入现有的IQueryable,从而保留可组合性并生成有效的运行时查询(实际上,当我运行它时,查询需要34ms才能根据SQL Server Profiler执行,而旧的Linq生成的低效查询耗时830毫秒。

以下是我使用的内容:

CREATE VIEW ListItemStatistics AS

SELECT
    ListId,

    COUNT(*) AS [CountAll],
    COUNT( CASE WHEN ... ) AS Count...

FROM
    ListItems
WHERE
    Foo = 'bar'
GROUP BY
     ListId

然后从Linq内部:

IQueryable lists = GetListsQuery( 0, 25 );

var listsWithItemsStats = lists.
    .Join(
        inner: this.dbContext.ListItemStatistics,
        outerKeySelector: list => list.ListId,
        innerKeySelector: row => row.ListId,
        resultSelector: (list,row) => new { list, row }
    );

但是因为这确实使用了数据库端逻辑(在VIEW中),所以它并不理想。

答案 1 :(得分:1)

根据我的经验,当您在案例中使用应用过滤的聚合函数(例如Count(predicate)时,EF会生成此类查询。如果您使用条件求和(Count(condition))替换Sum(condition ? 1 : 0)构造,您将获得更好的SQL查询:

var stats = db.ListItems
    .Where(li => li.Qux != null)
    .GroupBy(li => li.ListId)
    .Select(grp => new
    {
        grp.Key,
        Count = grp.Count(),
        CountFooBar = grp.Sum(li => li.Foo == "bar" ? 1 : 0),
        CountBaz5 = grp.Sum(li => li.Baz > 5 ? 1 : 0)
    });

其他部分保持不变。在mod之前我得到了与发布的类似的SQL,这是我在这个小mod之后得到的:

SELECT 
    [Limit1].[ListId] AS [ListId], 
    [Limit1].[Name] AS [Name], 
    [Limit1].[DateCreated] AS [DateCreated], 
    [GroupBy1].[K1] AS [ListId1], 
    [GroupBy1].[A1] AS [C1], 
    [GroupBy1].[A2] AS [C2], 
    [GroupBy1].[A3] AS [C3]
    FROM   (SELECT [Extent1].[ListId] AS [ListId], [Extent1].[Name] AS [Name], [Extent1].[DateCreated] AS [DateCreated]
        FROM [dbo].[List] AS [Extent1]
        ORDER BY [Extent1].[DateCreated] ASC
        OFFSET 0 ROWS FETCH NEXT 25 ROWS ONLY  ) AS [Limit1]
    INNER JOIN  (SELECT 
        [Filter1].[K1] AS [K1], 
        COUNT([Filter1].[A1]) AS [A1], 
        SUM([Filter1].[A2]) AS [A2], 
        SUM([Filter1].[A3]) AS [A3]
        FROM ( SELECT 
            [Extent2].[ListId] AS [K1], 
            1 AS [A1], 
            CASE WHEN (N'bar' = [Extent2].[Foo]) THEN 1 ELSE 0 END AS [A2], 
            CASE WHEN ([Extent2].[Baz] > 5) THEN 1 ELSE 0 END AS [A3]
            FROM [dbo].[ListItem] AS [Extent2]
            WHERE [Extent2].[Qux] IS NOT NULL
        )  AS [Filter1]
        GROUP BY [K1] ) AS [GroupBy1] ON [Limit1].[ListId] = [GroupBy1].[K1]

UPDATE:以上只是手动SQL查询的LINQ等价物(当您调整连接代码以生成LEFT OUTER JOIN时)。但是,考虑到主数据分页,OUTER APPLY SQL查询可能会表现得更好。您可以让LINQ to Entities生成这样类型的查询:

var listsWithItemsStats = lists
    .SelectMany(l => db.ListItems
        .Where(li => li.ListId == l.ListId && li.Qux != null)
        .GroupBy(li => li.ListId)
        .Select(grp => new
        {
            grp.Key,
            Count = grp.Count(),
            CountFooBar = grp.Sum(li => li.Foo == "bar" ? 1 : 0),
            CountBaz5 = grp.Sum(li => li.Baz > 5 ? 1 : 0)
        })
        .DefaultIfEmpty(),
        (list, itemsStats) => new { list, itemsStats });