SQL: select only the most common result

Time: 2018-04-11 10:14:19

Tags: sql-server query-optimization

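I have a table of IDs and Items. Sometimes the Item associated with an ID is a slight variation of the other Items associated with the same ID. I need a query that picks the most common Item for each ID and assigns it to that ID.

The following query works, but I would like to optimize it so that it uses a single, flexible SELECT statement instead of joining two separate CTEs at the end: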

IF OBJECT_ID('tempdb..#Test') IS NOT NULL
    DROP TABLE #Test

CREATE TABLE #Test
(
    [ID] INT
    ,[Item] VARCHAR(20)
)

INSERT #Test
VALUES
(100, 'Apple'),
(100, 'Apple'),
(100, 'Apples'),
(200, 'Orange'),
(200, 'Orange'),
(200, 'Orange'),
(200, 'Oranges'),
(300, 'Grape');

WITH cteOne AS (SELECT
[ID]
,[Item]
,COUNT(*) [Count]
FROM #Test
GROUP BY [ID]
,[Item]
),
cteTwo AS (SELECT
[ID]
,MAX([Count]) [Max]
FROM cteOne
GROUP BY [ID])

SELECT
C1.[ID]
,C1.[Item]
FROM cteOne C1
INNER JOIN cteTwo C2 ON C2.[ID] = C1.[ID]
AND C2.[Max] = C1.[Count]
ORDER BY [ID]

Any help is appreciated!

2 answers:

Answer 0: (score: 2)

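You can use TOP 1 WITH TIES and order by ROW_NUMBER(). The most common Item of every ID gets row number 1, so all of those rows tie for first place and are returned together, one row per ID:
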
select
    top 1 with ties [ID], [Item]
from (
    SELECT
        [ID], [Item], COUNT(*) [Count]
    FROM #Test
    GROUP BY [ID], [Item]
) t
order by row_number() over (partition by [ID] order by [Count] desc)
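
One difference from the join in the question is worth noting: ROW_NUMBER() assigns 1 to only one row per ID, so if two Items share the highest count for an ID, just one of them comes back, chosen arbitrarily. If you want every tied Item, as the original join returns, swapping in RANK() is one option. The following is a minimal sketch under that assumption; the #TieDemo table and its rows are hypothetical and exist only to show a tie:

```sql
-- Hypothetical demo table (not from the question): ID 100 has a two-way tie.
IF OBJECT_ID('tempdb..#TieDemo') IS NOT NULL
    DROP TABLE #TieDemo;

CREATE TABLE #TieDemo ([ID] INT, [Item] VARCHAR(20));

INSERT #TieDemo
VALUES
(100, 'Apple'),
(100, 'Apples'),
(200, 'Orange'),
(200, 'Orange'),
(200, 'Oranges');

-- RANK() gives the value 1 to every Item that shares the top count
-- within its ID, so TOP 1 WITH TIES keeps all of them.
SELECT TOP 1 WITH TIES [ID], [Item]
FROM (
    SELECT [ID], [Item], COUNT(*) AS [Count]
    FROM #TieDemo
    GROUP BY [ID], [Item]
) AS t
ORDER BY RANK() OVER (PARTITION BY [ID] ORDER BY [Count] DESC);
```

With this data the query should return both 'Apple' and 'Apples' for ID 100 and only 'Orange' for ID 200. With ROW_NUMBER() in place of RANK(), ID 100 would yield a single row.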

Answer 1: (score: 0)

This is better:

;WITH 
cteOne AS (
    SELECT [ID],[Item] ,COUNT(*) [Count]
    FROM #Test
    GROUP BY [ID],[Item]
),
cteTwoo as (
    select *, ROW_NUMBER() over (partition by id order by [Count] desc) idx  -- desc, so idx = 1 is the most common Item
    from cteOne
)
select ID, Item
from cteTwoo
where idx = 1