Split dynamic values in two columns into multiple columns - remove duplicates

Time: 2016-07-13 06:33:32

Tags: sql sql-server database sql-server-2012

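I'm struggling to create a query that splits multiple values in a column into multiple columns, in order to help "de-duplicate" a data set.

It is best explained with the data below, but basically you will notice an interval field (INT), which is the DENSE_RANK over the ID, START, FINISH, DURA, and COD columns. The intervals are duplicated because of multiple overlapping PSSID and CSSID values. I'm wondering whether there is a good way to dynamically split the overlapping PSSID and CSSID fields into multiple columns. Here is what I mean.

Sample data: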

ID  START   FINISH  DURA    COD INT PSSID   CSSID
A1  33.18   33.27   0.09    ST  15  N13045  NULL
A1  33.18   33.27   0.09    ST  15  N13046  NULL
A1  33.27   33.285  0.015   DU  16  N13046  NULL
A1  33.27   33.285  0.015   DU  16  NULL    N20015
A1  33.27   33.285  0.015   DU  16  NULL    N2001516
A1  33.27   33.285  0.015   DU  16  NULL    N20033
A1  33.285  33.35   0.065   BM  17  N13046  NULL
A1  33.285  33.35   0.065   BM  17  NULL    N20015
A1  33.285  33.35   0.065   BM  17  NULL    N2001516
A1  33.285  33.35   0.065   BM  17  NULL    N20033
A1  33.35   33.395  0.045   DM  18  N13046  NULL
A1  33.35   33.395  0.045   DM  18  NULL    N20015
A1  33.35   33.395  0.045   DM  18  NULL    N2001516
A1  33.35   33.395  0.045   DM  18  NULL    N20033
A1  33.395  33.44   0.045   DN  19  N13046  NULL
A1  33.395  33.44   0.045   DN  19  NULL    N20015
A1  33.395  33.44   0.045   DN  19  NULL    N2001516
A1  33.395  33.44   0.045   DN  19  NULL    N20033
A1  33.44   33.485  0.045   BM  20  N13046  NULL
A1  33.44   33.485  0.045   BM  20  NULL    N2001516
A1  33.44   33.485  0.045   BM  20  NULL    N20033
A1  33.44   33.485  0.045   BM  20  NULL    N20034
A1  33.485  33.51   0.025   DN  21  N13046  NULL
A1  33.485  33.51   0.025   DN  21  NULL    N2001516
A1  33.485  33.51   0.025   DN  21  NULL    N20033
A1  33.485  33.51   0.025   DN  21  NULL    N20034
A1  33.51   33.595  0.085   DB  22  N13046  NULL
A1  33.51   33.595  0.085   DB  22  NULL    N2001516
A1  33.51   33.595  0.085   DB  22  NULL    N20034
A1  33.595  33.665  0.07    DN  23  N13046  NULL
A1  33.595  33.665  0.07    DN  23  NULL    N2001516
A1  33.595  33.665  0.07    DN  23  NULL    N20034
A1  33.665  33.785  0.12    DB  24  NULL    N2001516
A1  33.785  33.79   0.005   YS  25  NULL    NULL
A1  33.79   33.83   0.04    BM  26  NULL    NULL

Desired output:

ID  START   FINISH  DURA    COD INT PSSID1  PSSID2  CSSID1  CSSID2      CSSID3
A1  33.18   33.27   0.09    ST  15  N13046  N13045  NULL    NULL        NULL
A1  33.27   33.285  0.015   DU  16  N13046  NULL    N20015  N2001516    N20033
A1  33.285  33.35   0.065   BM  17  N13046  NULL    N20015  N2001516    N20033
A1  33.35   33.395  0.045   DM  18  N13046  NULL    N20015  N2001516    N20033
A1  33.395  33.44   0.045   DN  19  N13046  NULL    N20015  N2001516    N20033
A1  33.44   33.485  0.045   BM  20  N13046  NULL    N20034  N2001516    N20033
A1  33.485  33.51   0.025   DN  21  N13046  NULL    N20034  N2001516    N20033
A1  33.51   33.595  0.085   DB  22  N13046  NULL    N20034  N2001516    NULL
A1  33.595  33.665  0.07    DN  23  N13046  NULL    N20034  N2001516    NULL
A1  33.665  33.785  0.12    DB  24  NULL    NULL    NULL    N2001516    NULL
A1  33.785  33.79   0.005   YS  25  NULL    NULL    NULL    NULL        NULL
A1  33.79   33.83   0.04    BM  26  NULL    NULL    NULL    NULL        NULL

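To make matters worse, this is only a small part of the data, and a given interval may have more than three PSSID or CSSID values (although there should be an upper limit of 5). The query therefore needs to be dynamic to allow for this.

I'm using SQL Server 2012. The schema for the data above is provided below: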

CREATE TABLE #SampleData
    ([ID] varchar(2), [START] decimal(9,2), [FINISH] decimal(9,2), [DURA] decimal(9,2), [COD] varchar(2), [INT] int, [PSSID] varchar(6), [CSSID] varchar(8))
;

INSERT INTO #SampleData
    ([ID], [START], [FINISH], [DURA], [COD], [INT], [PSSID], [CSSID])
VALUES
    ('A1', 33.18, 33.27, 0.09, 'ST', 15, 'N13045', NULL),
    ('A1', 33.18, 33.27, 0.09, 'ST', 15, 'N13046', NULL),
    ('A1', 33.27, 33.285, 0.015, 'DU', 16, 'N13046', NULL),
    ('A1', 33.27, 33.285, 0.015, 'DU', 16, NULL, 'N20015'),
    ('A1', 33.27, 33.285, 0.015, 'DU', 16, NULL, 'N2001516'),
    ('A1', 33.27, 33.285, 0.015, 'DU', 16, NULL, 'N20033'),
    ('A1', 33.285, 33.35, 0.065, 'BM', 17, 'N13046', NULL),
    ('A1', 33.285, 33.35, 0.065, 'BM', 17, NULL, 'N20015'),
    ('A1', 33.285, 33.35, 0.065, 'BM', 17, NULL, 'N2001516'),
    ('A1', 33.285, 33.35, 0.065, 'BM', 17, NULL, 'N20033'),
    ('A1', 33.35, 33.395, 0.045, 'DM', 18, 'N13046', NULL),
    ('A1', 33.35, 33.395, 0.045, 'DM', 18, NULL, 'N20015'),
    ('A1', 33.35, 33.395, 0.045, 'DM', 18, NULL, 'N2001516'),
    ('A1', 33.35, 33.395, 0.045, 'DM', 18, NULL, 'N20033'),
    ('A1', 33.395, 33.44, 0.045, 'DN', 19, 'N13046', NULL),
    ('A1', 33.395, 33.44, 0.045, 'DN', 19, NULL, 'N20015'),
    ('A1', 33.395, 33.44, 0.045, 'DN', 19, NULL, 'N2001516'),
    ('A1', 33.395, 33.44, 0.045, 'DN', 19, NULL, 'N20033'),
    ('A1', 33.44, 33.485, 0.045, 'BM', 20, 'N13046', NULL),
    ('A1', 33.44, 33.485, 0.045, 'BM', 20, NULL, 'N2001516'),
    ('A1', 33.44, 33.485, 0.045, 'BM', 20, NULL, 'N20033'),
    ('A1', 33.44, 33.485, 0.045, 'BM', 20, NULL, 'N20034'),
    ('A1', 33.485, 33.51, 0.025, 'DN', 21, 'N13046', NULL),
    ('A1', 33.485, 33.51, 0.025, 'DN', 21, NULL, 'N2001516'),
    ('A1', 33.485, 33.51, 0.025, 'DN', 21, NULL, 'N20033'),
    ('A1', 33.485, 33.51, 0.025, 'DN', 21, NULL, 'N20034'),
    ('A1', 33.51, 33.595, 0.085, 'DB', 22, 'N13046', NULL),
    ('A1', 33.51, 33.595, 0.085, 'DB', 22, NULL, 'N2001516'),
    ('A1', 33.51, 33.595, 0.085, 'DB', 22, NULL, 'N20034'),
    ('A1', 33.595, 33.665, 0.07, 'DN', 23, 'N13046', NULL),
    ('A1', 33.595, 33.665, 0.07, 'DN', 23, NULL, 'N2001516'),
    ('A1', 33.595, 33.665, 0.07, 'DN', 23, NULL, 'N20034'),
    ('A1', 33.665, 33.785, 0.12, 'DB', 24, NULL, 'N2001516'),
    ('A1', 33.785, 33.79, 0.005, 'YS', 25, NULL, NULL),
    ('A1', 33.79, 33.83, 0.04, 'BM', 26, NULL, NULL)
;
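
The stated upper limit of five values per interval can be checked against the data before choosing how many columns to pivot. The following is a minimal sketch, assuming the #SampleData table created above exists in the current session; the aliases PerInterval, PssCount, CssCount, MaxPssPerInterval, and MaxCssPerInterval are made up for illustration.

```sql
-- Sketch only: assumes #SampleData from the statements above exists in this session.
SELECT MAX(PssCount) AS MaxPssPerInterval
      ,MAX(CssCount) AS MaxCssPerInterval
FROM
(
    SELECT [INT]
          ,COUNT(DISTINCT [PSSID]) AS PssCount  -- COUNT ignores NULLs
          ,COUNT(DISTINCT [CSSID]) AS CssCount
    FROM #SampleData
    GROUP BY [INT]
) PerInterval;
```

For the sample rows this returns 2 and 3, which matches the two PSSID columns and three CSSID columns in the desired output.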

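Thanks for all your help!

2 Answers:

Answer 0: (score: 4)

You have already defined the groups by creating the INT column. We can use it to pivot the PSS and CSS values separately and then join the results.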

SELECT *
INTO #DataSourcePSS
FROM
(
    SELECT [INT]
          ,[PSSID]
          ,CONCAT('PSSID',ROW_NUMBER() OVER (PARTITION BY [INT] ORDER BY [PSSID] DESC)) AS [RowID]
    FROM #SampleData
) DS
PIVOT
(
    MAX([PSSID]) FOR RowID IN ([PSSID1], [PSSID2], [PSSID3], [PSSID4], [PSSID5])
) PVT;

SELECT *
INTO #DataSourceCSS
FROM
(
    SELECT [INT]
          ,[CSSID]
          ,CONCAT('CSSID', ROW_NUMBER() OVER (PARTITION BY [INT] ORDER BY [CSSID] DESC)) AS [RowID] 
    FROM #SampleData
) DS
PIVOT
(
    MAX([CSSID]) FOR RowID IN ([CSSID1], [CSSID2], [CSSID3], [CSSID4], [CSSID5])
) PVT;

WITH DataSourceSD AS 
(
    SELECT DISTINCT [ID], [START], [FINISH], [DURA], [COD], [INT]
    FROM #SampleData
)
SELECT SD.*
      ,PSS.[PSSID1],PSS.[PSSID2],PSS.[PSSID3],PSS.[PSSID4],PSS.[PSSID5]
      ,CSS.[CSSID1],CSS.[CSSID2],CSS.[CSSID3],CSS.[CSSID4],CSS.[CSSID5]
FROM DataSourceSD SD
INNER JOIN #DataSourcePSS PSS
    ON SD.[INT] = PSS.[INT]
INNER JOIN #DataSourceCSS CSS
    ON SD.[INT] = CSS.[INT]
ORDER BY SD.[INT];

DROP TABLE #DataSourceCSS;
DROP TABLE #DataSourcePSS;
DROP TABLE #SampleData;


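Since each group can have at most five values, I pivot on five values. In that case you may end up with columns that contain no values at all. If that is not acceptable, you need to use a dynamic PIVOT.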
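
A dynamic PIVOT along those lines might look like the following minimal sketch. It is not the code above made dynamic: it swaps the two joined pivots for a single pivot over values unpivoted with CROSS APPLY, and it builds the column list with FOR XML PATH because STRING_AGG is not available in SQL Server 2012. The names @Cols, @Sql, Kind, Val, Nr, and ColName are assumptions introduced for illustration, and the sketch assumes #SampleData still exists in the session (that is, it has not been dropped yet).

```sql
-- Sketch only: assumes the #SampleData temp table from the question exists in this session.
DECLARE @Cols nvarchar(max), @Sql nvarchar(max);

-- Number the non-NULL values per interval and kind, then collect the distinct
-- column names that are actually needed, e.g. [PSSID1],[PSSID2],[CSSID1],...
WITH Numbered AS
(
    SELECT V.Kind
          ,ROW_NUMBER() OVER (PARTITION BY SD.[INT], V.Kind ORDER BY V.Val DESC) AS Nr
    FROM #SampleData SD
    CROSS APPLY (VALUES ('PSSID', SD.[PSSID]), ('CSSID', SD.[CSSID])) V(Kind, Val)
    WHERE V.Val IS NOT NULL
)
SELECT @Cols = STUFF((
    SELECT ',' + QUOTENAME(CONCAT(Kind, Nr))
    FROM (SELECT DISTINCT Kind, Nr FROM Numbered) D
    ORDER BY Kind DESC, Nr          -- PSSID columns first, then CSSID
    FOR XML PATH('')), 1, 1, '');

-- Fallback so the PIVOT stays valid if the table holds no PSSID/CSSID values at all.
SET @Cols = ISNULL(@Cols, N'[PSSID1]');

-- NULL values are kept here so that intervals without any PSSID/CSSID still
-- produce a row; ORDER BY ... DESC sorts NULLs last, so the non-NULL values
-- receive the same numbers as in the column list above.
SET @Sql = N'
SELECT *
FROM
(
    SELECT SD.[ID], SD.[START], SD.[FINISH], SD.[DURA], SD.[COD], SD.[INT], V.Val
          ,CONCAT(V.Kind, ROW_NUMBER() OVER (PARTITION BY SD.[INT], V.Kind ORDER BY V.Val DESC)) AS ColName
    FROM #SampleData SD
    CROSS APPLY (VALUES (''PSSID'', SD.[PSSID]), (''CSSID'', SD.[CSSID])) V(Kind, Val)
) DS
PIVOT
(
    MAX(Val) FOR ColName IN (' + @Cols + N')
) PVT
ORDER BY [INT];';

EXEC sys.sp_executesql @Sql;
```

The temp table is visible inside sp_executesql because the dynamic batch runs in a child scope of the session that created it. Only the column list is concatenated into the statement, and each name passes through QUOTENAME first.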

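Answer 1: (score: 2)

Try it like this

Updated according to your comment

The values are now placed according to a global numbering of the distinct PSSID and CSSID values. Unlike the expected output, I get four CSSID columns, because N20034 lands in a new column. I don't see any logic that decides which column a value should appear in... The key is the numbering. My first approach numbered the values by their position within their "parent" interval; this new approach ranks the distinct values so that each one goes into its own column...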

WITH Numbered AS
(
    SELECT *
          ,CASE WHEN PSSID IS NOT NULL THEN 'PSSID' ELSE 'CSSID' END AS ColumnName  
          ,ROW_NUMBER() OVER(PARTITION BY ID,Start,Finish,Dura,COD,[INT],CASE WHEN PSSID IS NOT NULL THEN 'PSSID' ELSE 'CSSID' END ORDER BY (SELECT NULL)) AS SortNr  
    FROM #SampleData
)
,DistinctPSSIDS AS
(
    SELECT DISTINCT
           DENSE_RANK() OVER(ORDER BY PSSID) AS SortNr
          ,PSSID
    FROM #SampleData
    WHERE PSSID IS NOT NULL
)
,DistinctCSSIDS AS
(
    SELECT DISTINCT
           DENSE_RANK() OVER(ORDER BY CSSID) AS SortNr
          ,CSSID
    FROM #SampleData
    WHERE CSSID IS NOT NULL
)
SELECT ID,Start,Finish,Dura,COD,[INT]
      ,MAX(CASE WHEN n.ColumnName='PSSID' AND dp.SortNr=1 THEN n.PSSID END) AS PSSID1
      ,MAX(CASE WHEN n.ColumnName='PSSID' AND dp.SortNr=2 THEN n.PSSID END) AS PSSID2
      ,MAX(CASE WHEN n.ColumnName='PSSID' AND dp.SortNr=3 THEN n.PSSID END) AS PSSID3
      ,MAX(CASE WHEN n.ColumnName='PSSID' AND dp.SortNr=4 THEN n.PSSID END) AS PSSID4
      ,MAX(CASE WHEN n.ColumnName='PSSID' AND dp.SortNr=5 THEN n.PSSID END) AS PSSID5
      ,MAX(CASE WHEN n.ColumnName='CSSID' AND dc.SortNr=1 THEN n.CSSID END) AS CSSID1
      ,MAX(CASE WHEN n.ColumnName='CSSID' AND dc.SortNr=2 THEN n.CSSID END) AS CSSID2
      ,MAX(CASE WHEN n.ColumnName='CSSID' AND dc.SortNr=3 THEN n.CSSID END) AS CSSID3
      ,MAX(CASE WHEN n.ColumnName='CSSID' AND dc.SortNr=4 THEN n.CSSID END) AS CSSID4
      ,MAX(CASE WHEN n.ColumnName='CSSID' AND dc.SortNr=5 THEN n.CSSID END) AS CSSID5
FROM Numbered AS n
LEFT JOIN DistinctPSSIDS AS dp ON dp.PSSID=n.PSSID
LEFT JOIN DistinctCSSIDS AS dc ON dc.CSSID=n.CSSID
GROUP BY ID,Start,Finish,Dura,COD,[INT]
ORDER BY [INT];

Result (Start, Finish, and Dura appear rounded because #SampleData declares them as decimal(9,2)):

+----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+--------+----------+--------+--------+--------+
| ID | Start | Finish | Dura | COD | INT | PSSID1 | PSSID2 | PSSID3 | PSSID4 | PSSID5 | CSSID1 | CSSID2   | CSSID3 | CSSID4 | CSSID5 |
+----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+--------+----------+--------+--------+--------+
| A1 | 33.18 | 33.27  | 0.09 | ST  | 15  | N13045 | N13046 | NULL   | NULL   | NULL   | NULL   | NULL     | NULL   | NULL   | NULL   |
+----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+--------+----------+--------+--------+--------+
| A1 | 33.27 | 33.29  | 0.02 | DU  | 16  | NULL   | N13046 | NULL   | NULL   | NULL   | N20015 | N2001516 | N20033 | NULL   | NULL   |
+----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+--------+----------+--------+--------+--------+
| A1 | 33.29 | 33.35  | 0.07 | BM  | 17  | NULL   | N13046 | NULL   | NULL   | NULL   | N20015 | N2001516 | N20033 | NULL   | NULL   |
+----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+--------+----------+--------+--------+--------+
| A1 | 33.35 | 33.40  | 0.05 | DM  | 18  | NULL   | N13046 | NULL   | NULL   | NULL   | N20015 | N2001516 | N20033 | NULL   | NULL   |
+----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+--------+----------+--------+--------+--------+
| A1 | 33.40 | 33.44  | 0.05 | DN  | 19  | NULL   | N13046 | NULL   | NULL   | NULL   | N20015 | N2001516 | N20033 | NULL   | NULL   |
+----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+--------+----------+--------+--------+--------+
| A1 | 33.44 | 33.49  | 0.05 | BM  | 20  | NULL   | N13046 | NULL   | NULL   | NULL   | NULL   | N2001516 | N20033 | N20034 | NULL   |
+----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+--------+----------+--------+--------+--------+
| A1 | 33.49 | 33.51  | 0.03 | DN  | 21  | NULL   | N13046 | NULL   | NULL   | NULL   | NULL   | N2001516 | N20033 | N20034 | NULL   |
+----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+--------+----------+--------+--------+--------+
| A1 | 33.51 | 33.60  | 0.09 | DB  | 22  | NULL   | N13046 | NULL   | NULL   | NULL   | NULL   | N2001516 | NULL   | N20034 | NULL   |
+----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+--------+----------+--------+--------+--------+
| A1 | 33.60 | 33.67  | 0.07 | DN  | 23  | NULL   | N13046 | NULL   | NULL   | NULL   | NULL   | N2001516 | NULL   | N20034 | NULL   |
+----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+--------+----------+--------+--------+--------+
| A1 | 33.67 | 33.79  | 0.12 | DB  | 24  | NULL   | NULL   | NULL   | NULL   | NULL   | NULL   | N2001516 | NULL   | NULL   | NULL   |
+----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+--------+----------+--------+--------+--------+
| A1 | 33.79 | 33.79  | 0.01 | YS  | 25  | NULL   | NULL   | NULL   | NULL   | NULL   | NULL   | NULL     | NULL   | NULL   | NULL   |
+----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+--------+----------+--------+--------+--------+
| A1 | 33.79 | 33.83  | 0.04 | BM  | 26  | NULL   | NULL   | NULL   | NULL   | NULL   | NULL   | NULL     | NULL   | NULL   | NULL   |
+----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+--------+----------+--------+--------+--------+
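
The difference between the two numberings discussed in this answer can be inspected directly. The following is a minimal sketch, assuming #SampleData from the question exists in the session; the column aliases PositionInInterval and GlobalRank are made up for illustration.

```sql
-- Sketch only: assumes #SampleData from the question exists in this session.
SELECT [INT]
      ,[CSSID]
      -- Position within the interval: restarts at 1 for every [INT]
      ,ROW_NUMBER() OVER (PARTITION BY [INT] ORDER BY [CSSID]) AS PositionInInterval
      -- Global rank of the distinct value: the same CSSID always gets the same number
      ,DENSE_RANK() OVER (ORDER BY [CSSID]) AS GlobalRank
FROM #SampleData
WHERE [CSSID] IS NOT NULL
ORDER BY [INT], [CSSID];
```

Pivoting on the position packs the values of each interval to the left, whereas pivoting on the global rank keeps every distinct value in a fixed column and leaves gaps where an interval does not contain it.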

Previous version, with GROUP BY and aggregates:

WITH Numbered AS
(
    SELECT *
          ,CASE WHEN PSSID IS NOT NULL THEN 'PSSID' ELSE 'CSSID' END AS ColumnName  
          -- Order by the value itself so that the column assignment is deterministic
          ,ROW_NUMBER() OVER(PARTITION BY ID,Start,Finish,Dura,COD,[INT],CASE WHEN PSSID IS NOT NULL THEN 'PSSID' ELSE 'CSSID' END ORDER BY COALESCE(PSSID,CSSID)) AS SortNr
    FROM #SampleData
)
SELECT ID,Start,Finish,Dura,COD,[INT]
      ,MAX(CASE WHEN ColumnName='PSSID' AND SortNr=1 THEN PSSID END) AS PSSID1
      ,MAX(CASE WHEN ColumnName='PSSID' AND SortNr=2 THEN PSSID END) AS PSSID2
      ,MAX(CASE WHEN ColumnName='PSSID' AND SortNr=3 THEN PSSID END) AS PSSID3
      ,MAX(CASE WHEN ColumnName='PSSID' AND SortNr=4 THEN PSSID END) AS PSSID4
      ,MAX(CASE WHEN ColumnName='PSSID' AND SortNr=5 THEN PSSID END) AS PSSID5
      ,MAX(CASE WHEN ColumnName='CSSID' AND SortNr=1 THEN CSSID END) AS CSSID1
      ,MAX(CASE WHEN ColumnName='CSSID' AND SortNr=2 THEN CSSID END) AS CSSID2
      ,MAX(CASE WHEN ColumnName='CSSID' AND SortNr=3 THEN CSSID END) AS CSSID3
      ,MAX(CASE WHEN ColumnName='CSSID' AND SortNr=4 THEN CSSID END) AS CSSID4
      ,MAX(CASE WHEN ColumnName='CSSID' AND SortNr=5 THEN CSSID END) AS CSSID5
FROM Numbered
GROUP BY ID,Start,Finish,Dura,COD,[INT]
ORDER BY [INT];

Result

+----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+----------+----------+--------+--------+--------+
| ID | Start | Finish | Dura | COD | INT | PSSID1 | PSSID2 | PSSID3 | PSSID4 | PSSID5 | CSSID1   | CSSID2   | CSSID3 | CSSID4 | CSSID5 |
+----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+----------+----------+--------+--------+--------+
| A1 | 33.18 | 33.27  | 0.09 | ST  | 15  | N13045 | N13046 | NULL   | NULL   | NULL   | NULL     | NULL     | NULL   | NULL   | NULL   |
+----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+----------+----------+--------+--------+--------+
| A1 | 33.27 | 33.29  | 0.02 | DU  | 16  | N13046 | NULL   | NULL   | NULL   | NULL   | N20015   | N2001516 | N20033 | NULL   | NULL   |
+----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+----------+----------+--------+--------+--------+
| A1 | 33.29 | 33.35  | 0.07 | BM  | 17  | N13046 | NULL   | NULL   | NULL   | NULL   | N20015   | N2001516 | N20033 | NULL   | NULL   |
+----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+----------+----------+--------+--------+--------+
| A1 | 33.35 | 33.40  | 0.05 | DM  | 18  | N13046 | NULL   | NULL   | NULL   | NULL   | N20015   | N2001516 | N20033 | NULL   | NULL   |
+----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+----------+----------+--------+--------+--------+
| A1 | 33.40 | 33.44  | 0.05 | DN  | 19  | N13046 | NULL   | NULL   | NULL   | NULL   | N20015   | N2001516 | N20033 | NULL   | NULL   |
+----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+----------+----------+--------+--------+--------+
| A1 | 33.44 | 33.49  | 0.05 | BM  | 20  | N13046 | NULL   | NULL   | NULL   | NULL   | N2001516 | N20033   | N20034 | NULL   | NULL   |
+----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+----------+----------+--------+--------+--------+
| A1 | 33.49 | 33.51  | 0.03 | DN  | 21  | N13046 | NULL   | NULL   | NULL   | NULL   | N2001516 | N20033   | N20034 | NULL   | NULL   |
+----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+----------+----------+--------+--------+--------+
| A1 | 33.51 | 33.60  | 0.09 | DB  | 22  | N13046 | NULL   | NULL   | NULL   | NULL   | N2001516 | N20034   | NULL   | NULL   | NULL   |
+----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+----------+----------+--------+--------+--------+
| A1 | 33.60 | 33.67  | 0.07 | DN  | 23  | N13046 | NULL   | NULL   | NULL   | NULL   | N2001516 | N20034   | NULL   | NULL   | NULL   |
+----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+----------+----------+--------+--------+--------+
| A1 | 33.67 | 33.79  | 0.12 | DB  | 24  | NULL   | NULL   | NULL   | NULL   | NULL   | N2001516 | NULL     | NULL   | NULL   | NULL   |
+----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+----------+----------+--------+--------+--------+
| A1 | 33.79 | 33.79  | 0.01 | YS  | 25  | NULL   | NULL   | NULL   | NULL   | NULL   | NULL     | NULL     | NULL   | NULL   | NULL   |
+----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+----------+----------+--------+--------+--------+
| A1 | 33.79 | 33.83  | 0.04 | BM  | 26  | NULL   | NULL   | NULL   | NULL   | NULL   | NULL     | NULL     | NULL   | NULL   | NULL   |
+----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+----------+----------+--------+--------+--------+