SQL: Compare parent hierarchies and exclude rows that are duplicates

Time: 2015-06-24 10:53:36

Tags: sql duplicates grouping sql-server-2014

I want to exclude duplicates from my data, unless the duplicate is part of a new grouping/series.

CREATE TABLE #MYTEMPTABLE
(CHILD VARCHAR(2), PARENT VARCHAR(2), YEAR INT)
INSERT INTO #MYTEMPTABLE (CHILD, PARENT, YEAR)
VALUES
('1B','1A',2014),
('1A','1A',2014),
('2B','2A',2014),
('2A','2A',2014),
('3A','3A',2014),
('3B','3A',2014),
('3B','3B',2014), -- the data would also have this row
('3C','3B',2014),
('4A','4A',2014),
('4B','4A',2014),
('4B','4B',2014), -- and this one
('4C','4B',2014)

CREATE TABLE #MYTEMPTABLE2
(CHILD VARCHAR(2), PARENT VARCHAR(2), YEAR INT)

INSERT INTO #MYTEMPTABLE2 (CHILD, PARENT, YEAR)
VALUES
('1A','1A',2015),
('1C','1A',2015),
('2B','2A',2015),
('2A','2A',2015),
('3A','3A',2015),
('3B','3A',2015),
('3B','3B',2015),
('3C','3B',2015),
('4A','4A',2015),
('4B','4A',2015),
('4B','4B',2015),
('4D','4B',2015)

The result I want (CHILD, PARENT):
      1B,1A
      1A,1A
      2B,2A
      2A,2A
      3A,3A
      3B,3A
      3B,3B
      3C,3B
      4A,4A
      4B,4A
      4B,4B
      4C,4B
      1A,1A
      1C,1A
      4B,4B
      4D,4B

2 Answers:

Answer 0: (score: 0)

You can write the query as:

-- Step 2: Get all data from the first table
SELECT PARENT, CHILD
FROM #MYTEMPTABLE
UNION ALL -- to allow valid duplicates
SELECT PARENT, CHILD
FROM #MYTEMPTABLE2
WHERE PARENT IN (
    -- Step 1: Get parents which have a new sequence in the second table
    SELECT PARENT
    FROM #MYTEMPTABLE2 T
    WHERE NOT EXISTS (
        SELECT *
        FROM #MYTEMPTABLE T1
        WHERE T.PARENT = T1.PARENT
          AND T.CHILD = T1.CHILD
          AND T.YEAR <> T1.YEAR
    )
)
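
Step 1 of the query above can be checked on its own. The following is only a sketch: it assumes the self-parent rows for 3B and 4B are present in both years, and it uses two inline CTEs, Y2014 and Y2015 (hypothetical names standing in for the two temp tables), so it runs without any setup. The YEAR column is left out because each CTE holds a single year.

```sql
-- Hypothetical inline copies of the question's data (one CTE per year).
WITH Y2014 (CHILD, PARENT) AS (
    SELECT CHILD, PARENT FROM (VALUES
        ('1B','1A'), ('1A','1A'), ('2B','2A'), ('2A','2A'),
        ('3A','3A'), ('3B','3A'), ('3B','3B'), ('3C','3B'),
        ('4A','4A'), ('4B','4A'), ('4B','4B'), ('4C','4B')
    ) AS v (CHILD, PARENT)
),
Y2015 (CHILD, PARENT) AS (
    SELECT CHILD, PARENT FROM (VALUES
        ('1A','1A'), ('1C','1A'), ('2B','2A'), ('2A','2A'),
        ('3A','3A'), ('3B','3A'), ('3B','3B'), ('3C','3B'),
        ('4A','4A'), ('4B','4A'), ('4B','4B'), ('4D','4B')
    ) AS v (CHILD, PARENT)
)
-- Parents that own at least one 2015 pair which did not exist in 2014.
SELECT DISTINCT n.PARENT
FROM Y2015 AS n
WHERE NOT EXISTS (
    SELECT 1
    FROM Y2014 AS o
    WHERE o.PARENT = n.PARENT
      AND o.CHILD = n.CHILD
);
```

With this data the query returns the parents 1A (because of the new child 1C) and 4B (because of the new child 4D). The 2015 rows under those two parents are the four extra rows at the end of the desired result.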

Answer 1: (score: 0)

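This is a tough one. I'm not 100% sure it is right; it might take a mathematician with a university degree and a SQL guru working together to be certain. It does appear to display correctly according to your requirements. I have simplified and explained the query below: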

See SQL Fiddle

WITH MYTEMPTABLEHierarchy AS (
    -- Explode the entire hierarchy for #MYTEMPTABLE, i.e. work out which
    -- hierarchical level each parent/child relationship is on.
),

MYTEMPTABLEHierarchy2 AS (
    -- Explode the entire hierarchy for #MYTEMPTABLE2, i.e. work out which
    -- hierarchical level each parent/child relationship is on.
),

EqualHierarchy AS (
    -- Take the rows that are the same in both hierarchies.
),

DifferentHierarchy AS (
    -- Take the rows that differ between the two hierarchies.
),

DuplicateAncestorHierarchy AS (
    -- Take the parent rows whose hierarchy differs and find all of their
    -- parents; these become the duplicates.
)
SELECT * FROM EqualHierarchy
UNION ALL
SELECT * FROM DifferentHierarchy
UNION ALL
SELECT * FROM DuplicateAncestorHierarchy
ORDER BY HLevel, PARENT, CHILD
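
The bodies of the CTEs are not shown in the outline above, so the following is only a guess at what the first "explode the hierarchy" step might look like, written as a recursive CTE. It assumes that a row with CHILD = PARENT marks the root of a series, and it uses a tiny inline sample named Rel (a hypothetical name) instead of the temp tables; the HLevel column name is taken from the ORDER BY in the outline.

```sql
-- Hypothetical three-row sample: 3A is a root, 3B sits under 3A, 3C under 3B.
WITH Rel (CHILD, PARENT) AS (
    SELECT CHILD, PARENT FROM (VALUES
        ('3A','3A'), ('3B','3A'), ('3C','3B')
    ) AS v (CHILD, PARENT)
),
Hierarchy AS (
    -- Anchor: self-parent rows are treated as level-0 roots (an assumption).
    SELECT CHILD, PARENT, 0 AS HLevel
    FROM Rel
    WHERE CHILD = PARENT

    UNION ALL

    -- Recursive step: attach each non-root row one level below its parent.
    SELECT r.CHILD, r.PARENT, h.HLevel + 1
    FROM Rel AS r
    JOIN Hierarchy AS h
        ON r.PARENT = h.CHILD
    WHERE r.CHILD <> r.PARENT
)
SELECT CHILD, PARENT, HLevel
FROM Hierarchy
ORDER BY HLevel, PARENT, CHILD;
```

For this sample the result is (3A, 3A, 0), (3B, 3A, 1), (3C, 3B, 2). If a node such as 3B also has its own self-parent row, it would be picked up as a second root and its descendants would appear at two levels, so the real query would need an extra rule to decide which level wins.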