Question

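I inherited some interesting SQL and am trying to work out how to eliminate rows with duplicate IDs. Our indexes are stored in a somewhat columnar format, and the query pivots all the rows for a card into a single row, with the values as separate columns.

The sample below returns three rows of unique data, but the IDs are duplicated. I need only two rows, one per unique ID (along with the other columns that go with it). I know I will lose some data, but I just need one matching row per ID (first, topmost, oldest, newest, whichever).

I have tried DISTINCT, GROUP BY, and ROW_NUMBER, but I keep getting the syntax wrong or using them in the wrong place.

I am also open to rewriting the query completely in a reusable way, since I currently have to generate it dynamically (cardtypes and cardindexes are user-defined) and would like to be able to create a stored procedure. Thanks in advance!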

declare @cardtypes table ([ID] int, [Name] nvarchar(50))
declare @cards table ([ID] int, [CardTypeID] int, [Name] nvarchar(50))
declare @cardindexes table ([ID] int, [CardID] int, [IndexType] int, [StringVal] nvarchar(255), [DateVal] datetime)

INSERT INTO @cardtypes VALUES (1, 'Funny Cards')
INSERT INTO @cardtypes VALUES (2, 'Sad Cards')

INSERT INTO @cards VALUES (1, 1, 'Bunnies')
INSERT INTO @cards VALUES (2, 1, 'Dogs')
INSERT INTO @cards VALUES (3, 1, 'Cat')
INSERT INTO @cards VALUES (4, 1, 'Cat2')

INSERT INTO @cardindexes VALUES (1, 1, 1, 'Bunnies', null)
INSERT INTO @cardindexes VALUES (2, 1, 1, 'playing', null)
INSERT INTO @cardindexes VALUES (3, 1, 2, null, '2014-09-21')
INSERT INTO @cardindexes VALUES (4, 2, 1, 'Dogs', null)
INSERT INTO @cardindexes VALUES (5, 2, 1, 'playing', null)
INSERT INTO @cardindexes VALUES (6, 2, 1, 'poker', null)
INSERT INTO @cardindexes VALUES (7, 2, 2, null, '2014-09-22')


SELECT TOP(100)
    [ID] = c.[ID],
    [Name] = c.[Name],
    [Keyword] = [colKeyword].[StringVal],
    [DateAdded] = [colDateAdded].[DateVal]
FROM @cards AS c
LEFT JOIN @cardindexes AS [colKeyword] ON [colKeyword].[CardID] = c.ID AND [colKeyword].[IndexType] = 1
LEFT JOIN @cardindexes AS [colDateAdded] ON [colDateAdded].[CardID] = c.ID AND [colDateAdded].[IndexType] = 2
WHERE [colKeyword].[StringVal] LIKE 'p%' AND c.[CardTypeID] = 1
ORDER BY [DateAdded]

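Edit:

While both solutions work, I ended up using the MAX() solution from @popovitsj because it was easier to implement. The issue of values coming from multiple rows is not really a factor for me, since all the rows are essentially part of the same record. I will most likely use both solutions, depending on my needs.

Here is my updated query (since it doesn't quite match the answer):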

SELECT TOP(100)
    [ID] = c.[ID],
    [Name] = MAX(c.[Name]),
    [Keyword] = MAX([colKeyword].[StringVal]),
    [DateAdded] = MAX([colDateAdded].[DateVal])
FROM @cards AS c
LEFT JOIN @cardindexes AS [colKeyword] ON [colKeyword].[CardID] = c.ID AND [colKeyword].[IndexType] = 1
LEFT JOIN @cardindexes AS [colDateAdded] ON [colDateAdded].[CardID] = c.ID AND [colDateAdded].[IndexType] = 2
WHERE [colKeyword].[StringVal] LIKE 'p%' AND c.[CardTypeID] = 1
GROUP BY c.ID
ORDER BY [DateAdded]

Answer 1

The ROW_NUMBER window function combined with a CTE handles this nicely. For example:

;WITH preResult AS (
    SELECT
        [ID] = c.[ID],
        [Name] = c.[Name],
        [Keyword] = [colKeyword].[StringVal],
        [DateAdded] = [colDateAdded].[DateVal],
        -- Number the rows within each card ID; rn = 1 marks the row to keep
        rn = ROW_NUMBER() OVER (PARTITION BY c.[ID] ORDER BY [colDateAdded].[DateVal])
    FROM @cards AS c
    LEFT JOIN @cardindexes AS [colKeyword] ON [colKeyword].[CardID] = c.ID AND [colKeyword].[IndexType] = 1
    LEFT JOIN @cardindexes AS [colDateAdded] ON [colDateAdded].[CardID] = c.ID AND [colDateAdded].[IndexType] = 2
    WHERE [colKeyword].[StringVal] LIKE 'p%' AND c.[CardTypeID] = 1
)
-- TOP and ORDER BY are applied after de-duplication, so the limit counts unique IDs
SELECT TOP(100) [ID], [Name], [Keyword], [DateAdded]
FROM preResult
WHERE rn = 1
ORDER BY [DateAdded]
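
One thing to watch with this approach: when several rows for the same ID share the same date, ordering by the date alone leaves the choice of the rn = 1 row up to the engine. The sketch below is not part of the original answer; it uses a hypothetical @joined table variable that stands in for the joined output of the question's query, and adds [Keyword] as an assumed tie-breaker so the kept row is predictable. It also assumes SQL Server 2008 or later for the multi-row VALUES syntax.

```sql
-- Hypothetical stand-in for the three rows the question's query returns
DECLARE @joined TABLE ([ID] int, [Name] nvarchar(50), [Keyword] nvarchar(255), [DateAdded] datetime);
INSERT INTO @joined VALUES
    (1, 'Bunnies', 'playing', '2014-09-21'),
    (2, 'Dogs',    'playing', '2014-09-22'),
    (2, 'Dogs',    'poker',   '2014-09-22');

;WITH numbered AS (
    SELECT *,
        -- [Keyword] is an assumed tie-breaker for rows with equal dates
        rn = ROW_NUMBER() OVER (PARTITION BY [ID] ORDER BY [DateAdded], [Keyword])
    FROM @joined
)
SELECT [ID], [Name], [Keyword], [DateAdded]
FROM numbered
WHERE rn = 1          -- keep exactly one row per ID
ORDER BY [DateAdded];
```

With this sample data the query keeps one row for each of the two IDs; for ID 2 the 'playing' row should sort ahead of 'poker' under typical collations.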

Answer 2

You can use MAX or MIN to "decide" what to display for the other columns in the duplicated rows.

SELECT ID, MAX(Name), MAX(Keyword), MAX(DateAdded)
(...)
GROUP BY ID;
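
Keep in mind that each aggregate is evaluated independently, so the values in one output row may come from different source rows. A minimal sketch of that effect, using a hypothetical @joined table variable with made-up values (not data from the question) and assuming SQL Server 2008 or later for the multi-row VALUES syntax:

```sql
-- Hypothetical rows: two index entries for the same card ID
DECLARE @joined TABLE ([ID] int, [Keyword] nvarchar(255), [DateAdded] datetime);
INSERT INTO @joined VALUES
    (1, 'apple', '2014-09-25'),
    (1, 'zebra', '2014-09-01');

SELECT
    [ID],
    [Keyword]   = MAX([Keyword]),    -- taken from the second row ('zebra')
    [DateAdded] = MAX([DateAdded])   -- taken from the first row (2014-09-25)
FROM @joined
GROUP BY [ID];
```

The single result row combines a keyword and a date that never appeared together in the input. If the columns must all come from the same source row, the ROW_NUMBER approach is the safer choice.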

SQL: return only distinct IDs from a LEFT JOIN
