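Get information about duplicates

Date: 2017-12-27 09:32:59

Tags: sql sql-server duplicates

Edit by the author: I described the problem incorrectly. Here is the rephrased version.

I have inherited a database, and I am having trouble building a working SQL query.

Suppose this is the data: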

[Table]

| Id    | DisplayId     | Sequence  | Type      | Description   |
|----   |-----------    |---------- |-----------| -----------   |
| 1     | 12345         | 0         | 16        | Random        |
| 2     | 12345         | 0         | 2         | Random 2      |
| 3     | AB123         | 0         | 1         | Random 3      |
| 4     | 12345         | 1         | 16        | Random 4      |
| 5     | 12345         | 1         | 2         | Random 5      |
| 6     | XX45          | 0         | 5         | Random 6      |
| 7     | 12345         | 2         | 16        | Random 7      |
| 8     | 12345         | 2         | 2         | Random 8      |
| 9     | XX45          | 1         | 5         | Random 9      |
| 10    | XX45          | 2         | 5         | Random 10     |
| 11    | XX45          | 2         | 12        | Random 11     |
| 12    | 12345         | 3         | 16        | Random 12     |


[Type]

| Id    | State     |
|----   |-----------|
| 1     | 'ABC'     |
| 2     | '456'     |
| 5     | 'XYZ'     |
| 12    | 'XYZ'     |
| 16    | '456'     |

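The Type column is a foreign key referencing the Type table. Now I need to select the rows that are duplicates when comparing DisplayId and Type.State, and then show only the highest Sequence of each DisplayId / Type.State set. In addition, the Id column should be used to join other data (for example, OtherTable.Title).

So, for the data shown above, this would be the expected result: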

| Id    | DisplayId     | Sequence  | Type      | Description   | OtherTable.Title  |
|----   |-----------    |---------- |-----------|-------------  |------------------ |
| 8     | 12345         | 2         | 2         | Random 8      | Title 8           |
| 10    | XX45          | 2         | 5         | Random 10     | Title 10          |
| 11    | XX45          | 2         | 12        | Random 11     | Title 11          |
| 12    | 12345         | 3         | 16        | Random 12     | Title 12          |

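I managed to get the comparison and the highest-Sequence selection working, which gives me a distinct list of the DisplayId/Type pairs that have duplicates, but as soon as I put the Id column back in to show the other data, it all falls apart...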

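-- Attempt: DisplayId/Type pairs that occur more than once at the highest Sequence
SELECT
    P.DisplayId, P.[Type]
FROM
    [Table] P  -- Table is a reserved word in SQL Server, so it is bracketed
INNER JOIN
    -- highest Sequence per DisplayId
    (SELECT DisplayId, MAX([Sequence]) AS Seq FROM [Table] GROUP BY DisplayId) HighSeq
    ON P.DisplayId = HighSeq.DisplayId AND P.[Sequence] = HighSeq.Seq
GROUP BY
    P.DisplayId, P.[Type]
HAVING
    COUNT(*) > 1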

I am eager to hear your insights...

4 answers:

Answer 0: (score: 0)

Use ROW_NUMBER

;WITH CTE AS
(
    SELECT
        -- number the rows within each DisplayId/Sequence group
        RN = ROW_NUMBER() OVER (PARTITION BY DisplayId, [Sequence] ORDER BY DisplayId, [Sequence]),
        Id,
        DisplayId,
        [Sequence],
        [Type],
        [Description]
    FROM YourTable
)
SELECT
    *
FROM CTE
INNER JOIN YourTable2 YT2
    ON CTE.Id = YT2.Id
-- RN > 1 keeps only the second and later rows of each group
WHERE CTE.RN > 1

You can join other tables to the CTE just as you would with an ordinary table.
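To see what the CTE above flags, here is a minimal sketch that runs it against the sample rows from the question. It uses Python's sqlite3 module rather than SQL Server, so `RN = ...` is written as `... AS RN` and the brackets are dropped. The contents of YourTable2 (an Id plus a "Title n" string) are an assumption, and SQLite 3.25 or later is needed for window functions.

```python
import sqlite3

# (Id, DisplayId, Sequence, Type, Description), copied from the question
ROWS = [
    (1, "12345", 0, 16, "Random"), (2, "12345", 0, 2, "Random 2"),
    (3, "AB123", 0, 1, "Random 3"), (4, "12345", 1, 16, "Random 4"),
    (5, "12345", 1, 2, "Random 5"), (6, "XX45", 0, 5, "Random 6"),
    (7, "12345", 2, 16, "Random 7"), (8, "12345", 2, 2, "Random 8"),
    (9, "XX45", 1, 5, "Random 9"), (10, "XX45", 2, 5, "Random 10"),
    (11, "XX45", 2, 12, "Random 11"), (12, "12345", 3, 16, "Random 12"),
]

def flagged_pairs():
    """Return the (DisplayId, Sequence) pairs of the rows with RN > 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE YourTable (Id INTEGER, DisplayId TEXT, "
                 "Sequence INTEGER, Type INTEGER, Description TEXT)")
    conn.execute("CREATE TABLE YourTable2 (Id INTEGER, Title TEXT)")
    conn.executemany("INSERT INTO YourTable VALUES (?, ?, ?, ?, ?)", ROWS)
    # Assumed content for the joined table: one title per Id
    conn.executemany("INSERT INTO YourTable2 VALUES (?, ?)",
                     [(r[0], f"Title {r[0]}") for r in ROWS])
    rows = conn.execute("""
        WITH CTE AS (
            SELECT ROW_NUMBER() OVER (PARTITION BY DisplayId, Sequence
                                      ORDER BY DisplayId, Sequence) AS RN,
                   Id, DisplayId, Sequence, Type, Description
            FROM YourTable
        )
        SELECT CTE.DisplayId, CTE.Sequence
        FROM CTE
        INNER JOIN YourTable2 YT2 ON CTE.Id = YT2.Id
        WHERE CTE.RN > 1
        ORDER BY CTE.DisplayId, CTE.Sequence
    """).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    print(flagged_pairs())
```

On this data the query returns one row for each DisplayId/Sequence pair that occurs twice, so "duplicate" here means a repeated DisplayId and Sequence, not a repeated DisplayId and Type.State. Which of the two tied rows receives RN = 2 is not guaranteed, because the ORDER BY uses the same columns as the PARTITION BY.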

Answer 1: (score: 0)

You can use this.

SELECT T.*, OT.Title
FROM (
    SELECT M.*, TY.State,
        -- RN = 1 marks the highest Sequence per DisplayId / Type / State
        RN = ROW_NUMBER() OVER (PARTITION BY M.DisplayId, M.[Type], TY.State ORDER BY M.[Sequence] DESC),
        -- CNT > 1 means the DisplayId / State combination occurs more than once
        CNT = COUNT(M.Id) OVER (PARTITION BY M.DisplayId, TY.State)
    FROM MyTable M
    INNER JOIN [Type] TY ON M.[Type] = TY.Id
) T
INNER JOIN OtherTable OT ON T.Id = OT.Id
WHERE
    T.RN = 1
    AND T.CNT > 1
ORDER BY T.Id
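As a quick check, the sketch below runs this query against the sample data from the question and compares the Ids with the expected result. It uses Python's sqlite3 module instead of SQL Server, so the `RN = ...` and `CNT = ...` assignments become `... AS RN` and `... AS CNT`. The OtherTable rows ("Title n" for every Id) are an assumption extrapolated from the four titles shown in the expected result, and SQLite 3.25 or later is needed for window functions.

```python
import sqlite3

# (Id, DisplayId, Sequence, Type, Description), copied from the question
ROWS = [
    (1, "12345", 0, 16, "Random"), (2, "12345", 0, 2, "Random 2"),
    (3, "AB123", 0, 1, "Random 3"), (4, "12345", 1, 16, "Random 4"),
    (5, "12345", 1, 2, "Random 5"), (6, "XX45", 0, 5, "Random 6"),
    (7, "12345", 2, 16, "Random 7"), (8, "12345", 2, 2, "Random 8"),
    (9, "XX45", 1, 5, "Random 9"), (10, "XX45", 2, 5, "Random 10"),
    (11, "XX45", 2, 12, "Random 11"), (12, "12345", 3, 16, "Random 12"),
]
# (Id, State), copied from the question
TYPES = [(1, "ABC"), (2, "456"), (5, "XYZ"), (12, "XYZ"), (16, "456")]

QUERY = """
SELECT T.Id, T.DisplayId, T.Sequence, T.Type, T.Description, OT.Title
FROM (
    SELECT M.*, TY.State,
           ROW_NUMBER() OVER (PARTITION BY M.DisplayId, M.Type, TY.State
                              ORDER BY M.Sequence DESC) AS RN,
           COUNT(M.Id) OVER (PARTITION BY M.DisplayId, TY.State) AS CNT
    FROM MyTable M
    INNER JOIN Type TY ON M.Type = TY.Id
) T
INNER JOIN OtherTable OT ON T.Id = OT.Id
WHERE T.RN = 1 AND T.CNT > 1
ORDER BY T.Id
"""

def run_query():
    """Load the sample data into an in-memory database and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Type (Id INTEGER PRIMARY KEY, State TEXT)")
    conn.execute("CREATE TABLE MyTable (Id INTEGER PRIMARY KEY, DisplayId TEXT, "
                 "Sequence INTEGER, Type INTEGER, Description TEXT)")
    conn.execute("CREATE TABLE OtherTable (Id INTEGER PRIMARY KEY, Title TEXT)")
    conn.executemany("INSERT INTO Type VALUES (?, ?)", TYPES)
    conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?, ?, ?)", ROWS)
    # Assumed titles: "Title n" for every Id
    conn.executemany("INSERT INTO OtherTable VALUES (?, ?)",
                     [(r[0], f"Title {r[0]}") for r in ROWS])
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    for row in run_query():
        print(row)
```

On the sample data this returns Ids 8, 10, 11 and 12, which matches the expected result. The AB123 row is dropped because its DisplayId / State combination occurs only once, and within each remaining set one row per Type survives.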

Answer 2: (score: 0)

You can try this:

-- a.ID is qualified: both a and table2 have an ID column
select a.ID, a.displayid, a.sequence, a.type, table2.Title from
(select ID, displayid, sequence, type,
        Row_number() over (partition by displayid, Type order by sequence desc) rn
 from table1) a
inner join table2 on a.id = table2.id and a.rn = 1
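For comparison, here is a minimal sketch that runs this query on the sample data from the question. It uses Python's sqlite3 module rather than SQL Server; the table2 rows ("Title n" per Id) are an assumption, and SQLite 3.25 or later is needed for window functions.

```python
import sqlite3

# (Id, DisplayId, Sequence, Type, Description), copied from the question
ROWS = [
    (1, "12345", 0, 16, "Random"), (2, "12345", 0, 2, "Random 2"),
    (3, "AB123", 0, 1, "Random 3"), (4, "12345", 1, 16, "Random 4"),
    (5, "12345", 1, 2, "Random 5"), (6, "XX45", 0, 5, "Random 6"),
    (7, "12345", 2, 16, "Random 7"), (8, "12345", 2, 2, "Random 8"),
    (9, "XX45", 1, 5, "Random 9"), (10, "XX45", 2, 5, "Random 10"),
    (11, "XX45", 2, 12, "Random 11"), (12, "12345", 3, 16, "Random 12"),
]

def latest_per_displayid_and_type():
    """Return the Ids of the highest-Sequence row per DisplayId/Type pair."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (Id INTEGER, DisplayId TEXT, "
                 "Sequence INTEGER, Type INTEGER, Description TEXT)")
    conn.execute("CREATE TABLE table2 (Id INTEGER, Title TEXT)")
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?, ?)", ROWS)
    # Assumed titles: "Title n" for every Id
    conn.executemany("INSERT INTO table2 VALUES (?, ?)",
                     [(r[0], f"Title {r[0]}") for r in ROWS])
    rows = conn.execute("""
        SELECT a.Id, a.DisplayId, a.Sequence, a.Type, table2.Title
        FROM (
            SELECT Id, DisplayId, Sequence, Type,
                   ROW_NUMBER() OVER (PARTITION BY DisplayId, Type
                                      ORDER BY Sequence DESC) AS rn
            FROM table1
        ) a
        INNER JOIN table2 ON a.Id = table2.Id AND a.rn = 1
        ORDER BY a.Id
    """).fetchall()
    conn.close()
    return [r[0] for r in rows]

if __name__ == "__main__":
    print(latest_per_displayid_and_type())
```

On the sample data this returns Ids 3, 8, 10, 11 and 12. Id 3 (AB123) is included because the query keeps the latest row of every DisplayId/Type pair and has no check that the DisplayId / Type.State combination is actually duplicated.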

Answer 3: (score: 0)

This returns your expected result, except for Type 12, because it is not a duplicate.

;WITH base AS (
    SELECT
        TBL.DisplayId,
        TBL.[Sequence],
        TBL.[Type],
        TBL.[Description],
        TY.[State],
        -- running maximum of Sequence in Type order
        High_sequence = MAX(TBL.[Sequence]) OVER (ORDER BY TBL.[Type]),
        -- number of rows that share the same Type
        dups = COUNT(*) OVER (PARTITION BY TBL.[Type] ORDER BY TBL.[Type])
    FROM [Type] TY
    INNER JOIN [Table] TBL
        ON TY.Id = TBL.[Type]
)
SELECT
    t.Id,
    t.DisplayId,
    t.[Sequence],
    t.[Type],
    t.[Description],
    b.[State]
FROM base b
INNER JOIN [Table] t
    ON b.[Sequence] = t.[Sequence]
    AND b.[Type] = t.[Type]
    AND b.DisplayId = t.DisplayId
WHERE b.[Sequence] = b.High_sequence
    AND b.dups > 1;
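The claim about Type 12 can be checked with a minimal sketch that runs this query on the sample data from the question. It uses Python's sqlite3 module rather than SQL Server, so the `name = expression` columns are written as `expression AS name`; SQLite accepts the bracketed [Table] and [Type] names and needs version 3.25 or later for window functions. Only the Id column is selected to keep the check short.

```python
import sqlite3

# (Id, DisplayId, Sequence, Type, Description), copied from the question
ROWS = [
    (1, "12345", 0, 16, "Random"), (2, "12345", 0, 2, "Random 2"),
    (3, "AB123", 0, 1, "Random 3"), (4, "12345", 1, 16, "Random 4"),
    (5, "12345", 1, 2, "Random 5"), (6, "XX45", 0, 5, "Random 6"),
    (7, "12345", 2, 16, "Random 7"), (8, "12345", 2, 2, "Random 8"),
    (9, "XX45", 1, 5, "Random 9"), (10, "XX45", 2, 5, "Random 10"),
    (11, "XX45", 2, 12, "Random 11"), (12, "12345", 3, 16, "Random 12"),
]
# (Id, State), copied from the question
TYPES = [(1, "ABC"), (2, "456"), (5, "XYZ"), (12, "XYZ"), (16, "456")]

QUERY = """
WITH base AS (
    SELECT TBL.DisplayId, TBL.Sequence, TBL.Type, TY.State,
           MAX(TBL.Sequence) OVER (ORDER BY TBL.Type) AS High_sequence,
           COUNT(*) OVER (PARTITION BY TBL.Type ORDER BY TBL.Type) AS dups
    FROM [Type] TY
    INNER JOIN [Table] TBL ON TY.Id = TBL.Type
)
SELECT t.Id
FROM base b
INNER JOIN [Table] t
    ON b.Sequence = t.Sequence
    AND b.Type = t.Type
    AND b.DisplayId = t.DisplayId
WHERE b.Sequence = b.High_sequence
  AND b.dups > 1
ORDER BY t.Id
"""

def returned_ids():
    """Load the sample data into an in-memory database and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE [Type] (Id INTEGER PRIMARY KEY, State TEXT)")
    conn.execute("CREATE TABLE [Table] (Id INTEGER PRIMARY KEY, DisplayId TEXT, "
                 "Sequence INTEGER, Type INTEGER, Description TEXT)")
    conn.executemany("INSERT INTO [Type] VALUES (?, ?)", TYPES)
    conn.executemany("INSERT INTO [Table] VALUES (?, ?, ?, ?, ?)", ROWS)
    ids = [row[0] for row in conn.execute(QUERY)]
    conn.close()
    return ids

if __name__ == "__main__":
    print(returned_ids())
```

On the sample data this returns Ids 8, 10 and 12, so Id 11 (Type 12) is indeed missing. Note that High_sequence is a running maximum across Types rather than a maximum per DisplayId, and dups counts rows per Type rather than per DisplayId / State, so the match with the expected rows may not carry over to other data.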