Find duplicates based on additional information

Time: 2017-12-27 11:34:01

Tags: sql-server duplicates

I have inherited a database, and I'm having trouble constructing a working SQL query.

Suppose this is the data:

[Products]

| Id    | DisplayId     | Version   | Company   | Description   |
|----   |-----------    |---------- |-----------| -----------   |
| 1     | 12345         | 0         | 16        | Random        |
| 2     | 12345         | 0         | 2         | Random 2      |
| 3     | AB123         | 0         | 1         | Random 3      |
| 4     | 12345         | 1         | 16        | Random 4      |
| 5     | 12345         | 1         | 2         | Random 5      |
| 6     | AB123         | 0         | 5         | Random 6      |
| 7     | 12345         | 2         | 16        | Random 7      |
| 8     | XX45          | 0         | 5         | Random 8      |
| 9     | XX45          | 0         | 7         | Random 9      |
| 10    | XX45          | 1         | 5         | Random 10     |
| 11    | XX45          | 1         | 7         | Random 11     |


[Companies]

| Id    | Code      |
|----   |-----------|
| 1     | 'ABC'     |
| 2     | '456'     |
| 5     | 'XYZ'     |
| 7     | 'XYZ'     |
| 16    | '456'     |

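The Version column is a version number; a higher number means a newer version. The Company column is a foreign key referencing the Id column of the Companies table. There is also a table named ProductData, whose ProductId column references Products.Id.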
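
For anyone who wants to reproduce the setup, the schema described above could be sketched roughly as follows. The column types, lengths, and the generated "Title N" values are assumptions made for illustration; only the table names, column names, and the Products/Companies sample rows come from the question.

```sql
-- Hypothetical DDL: types and lengths are assumptions, not the real schema.
CREATE TABLE Companies (
    Id   INT         NOT NULL PRIMARY KEY,
    Code VARCHAR(10) NOT NULL
);

CREATE TABLE Products (
    Id          INT          NOT NULL PRIMARY KEY,
    DisplayId   VARCHAR(20)  NOT NULL,
    Version     INT          NOT NULL,
    Company     INT          NOT NULL REFERENCES Companies (Id),
    Description VARCHAR(100) NULL
);

CREATE TABLE ProductData (
    ProductId INT          NOT NULL REFERENCES Products (Id),
    Title     VARCHAR(100) NULL
);

-- Sample rows copied from the tables in the question.
INSERT INTO Companies (Id, Code) VALUES
    (1, 'ABC'), (2, '456'), (5, 'XYZ'), (7, 'XYZ'), (16, '456');

INSERT INTO Products (Id, DisplayId, Version, Company, Description) VALUES
    (1,  '12345', 0, 16, 'Random'),
    (2,  '12345', 0, 2,  'Random 2'),
    (3,  'AB123', 0, 1,  'Random 3'),
    (4,  '12345', 1, 16, 'Random 4'),
    (5,  '12345', 1, 2,  'Random 5'),
    (6,  'AB123', 0, 5,  'Random 6'),
    (7,  '12345', 2, 16, 'Random 7'),
    (8,  'XX45',  0, 5,  'Random 8'),
    (9,  'XX45',  0, 7,  'Random 9'),
    (10, 'XX45',  1, 5,  'Random 10'),
    (11, 'XX45',  1, 7,  'Random 11');

-- Assumption: one ProductData row per product, titled "Title <Id>".
INSERT INTO ProductData (ProductId, Title)
SELECT Id, CONCAT('Title ', Id)
FROM Products;
```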

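Now I need to find duplicates based on DisplayId and the corresponding Companies.Code. The ProductData table should be joined to show the title (ProductData.Title), and only the latest version of each product should be included in the result. So the expected result is: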

| Id    | DisplayId     | Version   | Company   | Description   | ProductData.Title |
|----   |-----------    |---------- |-----------|-------------  |------------------ |
| 5     | 12345         | 1         | 2         | Random 5      | Title 5           |
| 7     | 12345         | 2         | 16        | Random 7      | Title 7           |
| 10    | XX45          | 1         | 5         | Random 10     | Title 10          |
| 11    | XX45          | 1         | 7         | Random 11     | Title 11          |
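  • Because XX45 has 2 "entries": one with company 5 and one with company 7, and both companies share the same code.
  • Because 12345 has 2 "entries": one with company 2 and one with company 16, and both companies share the same code. Note that the latest version differs between the two (version 2 for the company 16 entry, version 1 for the company 2 entry).
  • AB123 should not be included, because its 2 entries have different company codes.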
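
To make the definition of a "duplicate" in the bullets above concrete, a grouping query along these lines lists every DisplayId/Code pair that is held by more than one company. This is only a sketch of the rule, not the final query: it ignores versions and titles, and the alias CompanyCount is made up for illustration.

```sql
-- Sketch: DisplayId/Code pairs that more than one company holds.
SELECT p.DisplayId,
       c.Code,
       COUNT(DISTINCT p.Company) AS CompanyCount  -- hypothetical alias
FROM Products AS p
INNER JOIN Companies AS c ON c.Id = p.Company
GROUP BY p.DisplayId, c.Code
HAVING COUNT(DISTINCT p.Company) > 1;
```

On the sample rows this yields the pairs (12345, '456') and (XX45, 'XYZ'), each with two companies. AB123 drops out because its two companies have the codes 'ABC' and 'XYZ'.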

I'm eager to hear your insights...

4 answers:

Answer 0: (score: 1)

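If I understand correctly, you can use a CTE to number the duplicate rows in the table, and then SELECT from the CTE or add further operations on top of it.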

WITH CTE AS(
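   SELECT p.Id, p.DisplayId, p.Version, p.Company, p.Description, d.Title,
       RN = ROW_NUMBER() OVER(PARTITION BY p.DisplayId, p.Company ORDER BY p.Id DESC)
   FROM dbo.Products AS p  -- stands in for the placeholder dbo.YourTable1
   INNER JOIN dbo.ProductData AS d ON d.ProductId = p.Id  -- needed for d.Title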
)

SELECT *
FROM CTE

Answer 1: (score: 1)

Based on your sample data, you only need to JOIN the tables:

  SELECT 
    p.Id, p.DisplayId, p.Version, p.Company, d.Title
  FROM Products AS p
  INNER JOIN Companies AS c ON p.Company = c.Id
  INNER JOIN ProductData AS d ON d.ProductId = p.Id;

But if you want only the latest version, you can use ROW_NUMBER():

WITH CTE
AS
(
  SELECT 
    p.Id, p.DisplayId, p.Version, p.Company, d.Title,
    ROW_NUMBER() OVER(PARTITION BY p.DisplayId,p.Company ORDER BY p.Id DESC) AS RN
  FROM Products AS p
  INNER JOIN Companies AS c ON p.Company = c.Id
  INNER JOIN ProductData AS d ON d.ProductId = p.Id
)
SELECT * 
FROM CTE
WHERE RN = 1;

Result from the sample fiddle:

| Id | DisplayId | Version | Company |    Title |
|----|-----------|---------|---------|----------|
|  5 |     12345 |       1 |       2 |  Title 5 |
|  7 |     12345 |       2 |      16 |  Title 7 |
| 10 |      XX45 |       1 |       5 | Title 10 |
| 11 |      XX45 |       1 |       7 | Title 11 |
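
Note that the query above joins Companies but never uses Companies.Code, so it returns the latest row per DisplayId and Company whether or not the companies share a code. If only products whose companies share a code are wanted, one possible extension is to count the latest rows per DisplayId and Code in a second CTE. This is a sketch under assumptions: the CTE names Latest and Counted and the column Cnt are made up here, and it orders by Version rather than Id on the assumption that Version is what defines "latest".

```sql
WITH Latest AS
(
  -- Number the rows per DisplayId/Company, newest version first.
  SELECT
    p.Id, p.DisplayId, p.Version, p.Company, p.Description, c.Code,
    ROW_NUMBER() OVER(PARTITION BY p.DisplayId, p.Company
                      ORDER BY p.Version DESC) AS RN
  FROM Products AS p
  INNER JOIN Companies AS c ON p.Company = c.Id
),
Counted AS
(
  -- Among the latest rows only, count how many share DisplayId + Code.
  SELECT
    l.Id, l.DisplayId, l.Version, l.Company, l.Description,
    COUNT(*) OVER(PARTITION BY l.DisplayId, l.Code) AS Cnt
  FROM Latest AS l
  WHERE l.RN = 1
)
SELECT
  n.Id, n.DisplayId, n.Version, n.Company, n.Description, d.Title
FROM Counted AS n
INNER JOIN ProductData AS d ON d.ProductId = n.Id
WHERE n.Cnt > 1
ORDER BY n.Id;
```

The window count has to live in its own CTE because window functions are not allowed in a WHERE clause. On the sample data, and assuming ProductData has a row for each product, this keeps Ids 5, 7, 10 and 11 and drops the AB123 rows.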

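Answer 2: (score: 0)

You first have to get the current versions, and then count how many times each DisplayID + Code combination appears among them. Based on that, you select only the combinations whose count is greater than 1. Finally, you can INNER JOIN ProductData in the final query to get the title.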

WITH
MaxVersion AS --Get the current versions
(
    SELECT
        MAX(Version) AS Version,
        DisplayID,
        Company
    FROM
        #TmpProducts
    GROUP BY
        DisplayID,
        Company
)
,CTE AS
(
    SELECT
        p.DisplayID,
        c.Code,
        COUNT(*) AS RowCounter
    FROM
        #TmpProducts p
    INNER JOIN
        #TmpCompanies c
        ON
            c.ID = p.Company
    INNER JOIN
        MaxVersion mv
        ON
            mv.DisplayID = p.DisplayID
        AND mv.Version = p.Version
        AND mv.Company = p.Company
    GROUP BY
        p.DisplayID,
        c.Code
)

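SELECT 
    p.*,
    pd.Title
FROM
    #TmpProducts p
INNER JOIN
    #TmpCompanies pc --needed so the count is matched on Code, not only on DisplayID
    ON
        pc.ID = p.Company
INNER JOIN
    CTE c
    ON
        c.DisplayID = p.DisplayID
    AND c.Code = pc.Code
INNER JOIN
    MaxVersion mv
    ON
        mv.DisplayID = p.DisplayID
    AND mv.Company = p.Company
    AND mv.Version = p.Version
INNER JOIN
    ProductData pd --table from the question; supplies the title
    ON
        pd.ProductId = p.ID
WHERE
    c.RowCounter > 1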
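
The query above reads from the temp tables #TmpProducts and #TmpCompanies, which the answer does not show being created. One way to stage them, assuming they are meant to be plain copies of the question's Products and Companies tables, would be:

```sql
-- Assumption: the temp tables are straight copies of the question's tables.
SELECT Id, DisplayId, Version, Company, Description
INTO #TmpProducts
FROM Products;

SELECT Id, Code
INTO #TmpCompanies
FROM Companies;
```

Both statements end with a semicolon on purpose: when a WITH clause follows another statement in the same batch, the preceding statement must be terminated with a semicolon.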

Answer 3: (score: 0)

Try this:

SELECT b.ID, b.displayid, b.version, b.company, productdata.title
FROM (
    SELECT a.ID, a.displayid, a.version, a.company, a.rn, a.code,
           COUNT(a.displayid) OVER (PARTITION BY a.displayid, a.code) AS cnt
    FROM (
        SELECT Prod.ID, displayid, version, company, Companies.code,
               ROW_NUMBER() OVER (PARTITION BY displayid, company ORDER BY version DESC) AS rn
        FROM Products AS Prod INNER JOIN Companies ON Prod.Company = Companies.id
    ) a
    WHERE a.rn = 1  -- latest version per DisplayId and company
) b
INNER JOIN productdata ON b.id = productdata.ProductId  -- ProductId references Products.Id
WHERE b.cnt > 1  -- more than one company shares this DisplayId + Code