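I've inherited some fun SQL and am trying to figure out how to eliminate rows with duplicate IDs. Our indexes are stored in a somewhat columnar format, and we then pivot all the rows into one, with the values as different columns.
The sample below returns three rows of unique data, but the IDs are duplicated. I need just two rows with unique IDs (and the other columns that come along with them). I understand that I'll lose some data, but I only need one matching row per ID (first, top, oldest, newest, whatever).
I've tried using DISTINCT, GROUP BY, and ROW_NUMBER, but I keep getting the syntax wrong or using them in the wrong place.
I'm also open to rewriting the query completely in a reusable way, since I currently have to generate it dynamically (cardtypes and cardindexes are user-defined) and would like to be able to create a stored procedure. Thanks in advance!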
declare @cardtypes table ([ID] int, [Name] nvarchar(50))
declare @cards table ([ID] int, [CardTypeID] int, [Name] nvarchar(50))
declare @cardindexes table ([ID] int, [CardID] int, [IndexType] int, [StringVal] nvarchar(255), [DateVal] datetime)
INSERT INTO @cardtypes VALUES (1, 'Funny Cards')
INSERT INTO @cardtypes VALUES (2, 'Sad Cards')
INSERT INTO @cards VALUES (1, 1, 'Bunnies')
INSERT INTO @cards VALUES (2, 1, 'Dogs')
INSERT INTO @cards VALUES (3, 1, 'Cat')
INSERT INTO @cards VALUES (4, 1, 'Cat2')
INSERT INTO @cardindexes VALUES (1, 1, 1, 'Bunnies', null)
INSERT INTO @cardindexes VALUES (2, 1, 1, 'playing', null)
INSERT INTO @cardindexes VALUES (3, 1, 2, null, '2014-09-21')
INSERT INTO @cardindexes VALUES (4, 2, 1, 'Dogs', null)
INSERT INTO @cardindexes VALUES (5, 2, 1, 'playing', null)
INSERT INTO @cardindexes VALUES (6, 2, 1, 'poker', null)
INSERT INTO @cardindexes VALUES (7, 2, 2, null, '2014-09-22')
SELECT TOP(100)
[ID] = c.[ID],
[Name] = c.[Name],
[Keyword] = [colKeyword].[StringVal],
[DateAdded] = [colDateAdded].[DateVal]
FROM @cards AS c
LEFT JOIN @cardindexes AS [colKeyword] ON [colKeyword].[CardID] = c.ID AND [colKeyword].[IndexType] = 1
LEFT JOIN @cardindexes AS [colDateAdded] ON [colDateAdded].[CardID] = c.ID AND [colDateAdded].[IndexType] = 2
WHERE [colKeyword].[StringVal] LIKE 'p%' AND c.[CardTypeID] = 1
ORDER BY [DateAdded]
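Edit:
While both solutions are valid, I ended up using @popovitsj's MAX() solution because it was easier to implement. The issue of values coming from multiple rows isn't really a factor for me, because all the rows are essentially part of the same record. I will most likely use both solutions depending on my needs.
Here's my updated query (since it didn't quite match the answer):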
SELECT TOP(100)
[ID] = c.[ID],
[Name] = MAX(c.[Name]),
[Keyword] = MAX([colKeyword].[StringVal]),
[DateAdded] = MAX([colDateAdded].[DateVal])
FROM @cards AS c
LEFT JOIN @cardindexes AS [colKeyword] ON [colKeyword].[CardID] = c.ID AND [colKeyword].[IndexType] = 1
LEFT JOIN @cardindexes AS [colDateAdded] ON [colDateAdded].[CardID] = c.ID AND [colDateAdded].[IndexType] = 2
WHERE [colKeyword].[StringVal] LIKE 'p%' AND c.[CardTypeID] = 1
GROUP BY c.ID
ORDER BY [DateAdded]
Answer 0 (score: 2)
The ROW_NUMBER window function and a CTE do this job nicely. For example:
;WITH preResult AS (
SELECT
[ID] = c.[ID],
[Name] = c.[Name],
[Keyword] = [colKeyword].[StringVal],
[DateAdded] = [colDateAdded].[DateVal],
-- Number the rows within each card; the keyword index ID breaks ties so the pick is deterministic
rn = ROW_NUMBER() OVER (PARTITION BY c.[ID] ORDER BY [colDateAdded].[DateVal], [colKeyword].[ID])
FROM @cards AS c
LEFT JOIN @cardindexes AS [colKeyword] ON [colKeyword].[CardID] = c.ID AND [colKeyword].[IndexType] = 1
LEFT JOIN @cardindexes AS [colDateAdded] ON [colDateAdded].[CardID] = c.ID AND [colDateAdded].[IndexType] = 2
WHERE [colKeyword].[StringVal] LIKE 'p%' AND c.[CardTypeID] = 1
)
-- TOP and ORDER BY go on the outer query so the limit applies after de-duplication
SELECT TOP(100) [ID], [Name], [Keyword], [DateAdded]
FROM preResult
WHERE rn = 1
ORDER BY [DateAdded]
Answer 1 (score: 2)
You can use MAX or MIN to "decide" what to display for the other columns in the duplicated rows.
SELECT ID, MAX(Name), MAX(Keyword), MAX(DateAdded)
(...)
GROUP BY ID;
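To see what MIN and MAX actually pick, here is a minimal sketch using a cut-down copy of the question's keyword rows. The table variable @keywords and the column aliases FirstKeyword and LastKeyword are made up for illustration and are not part of the original schema.

```sql
-- Illustrative only: just the 'p%' keyword rows from the question's sample data.
DECLARE @keywords TABLE ([CardID] int, [StringVal] nvarchar(255));
INSERT INTO @keywords VALUES (1, 'playing'), (2, 'playing'), (2, 'poker');

SELECT
    [CardID],
    [FirstKeyword] = MIN([StringVal]),  -- lowest value in sort order for this card
    [LastKeyword]  = MAX([StringVal])   -- highest value in sort order for this card
FROM @keywords
GROUP BY [CardID]
ORDER BY [CardID];
```

Card 2 has two matching keywords, so MIN and MAX can return different values for it, while card 1 has only one and gets the same value from both. Each aggregate is evaluated independently, so the columns of one output row may come from different source rows. If every column has to come from the same source row, ROW_NUMBER is probably the safer choice.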