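I have a dataset in which slowly changing data is stored as key-value pairs in rows, in the format below. The key is the ID column. Each key has a set of attributes, named in the Dimension column, with the corresponding value in the Value column. The StartDate and EndDate columns give the validity period of a particular attribute value. There is always a StartDate. If EndDate is NULL, the row holds the current value of that attribute for the ID. If EndDate contains a date, the attribute had that value between the start date and the end date.
In the example below, for ID FT96, the attribute 'Group' had the value 'Group2' on 16/01/2019 and 'Group22' on 01/02/2019, but as of now Group is 'Group2' again. A NULL EndDate means the value is still in effect today.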
StartDate | EndDate | ID | Dimension | Value
------------|------------|--------|-----------|-------------
02/11/2018 | 19/11/2018 | FTID15 | Name | Name1
02/11/2018 | NULL | FTID15 | Status | Active
02/11/2018 | NULL | FTID15 | Group | Group1
02/11/2018 | NULL | FTID15 | Sub Group | SUB Group1
20/11/2018 | 19/12/2018 | FTID15 | Name | Name2
20/12/2018 | 23/01/2019 | FTID15 | Name | Name3
24/01/2019 | 20/02/2019 | FTID15 | Name | Name4
21/02/2019 | 27/02/2019 | FTID15 | Name | Name5
28/02/2019 | NULL | FTID15 | Sub Group | SUB Group2
02/11/2018 | 19/11/2018 | FTID12 | Name | Namex1
02/11/2018 | NULL | FTID12 | Status | Active
02/11/2018 | NULL | FTID12 | Group | Group2
02/11/2018 | NULL | FTID12 | Sub Group | SUB Group13
20/11/2018 | NULL | FTID12 | Name | Namex2
02/11/2018 | NULL | FT96 | Name | NameYY
02/11/2018 | NULL | FT96 | Status | Active
02/11/2018 | 27/01/2019 | FT96 | Group | Group2
02/11/2018 | 27/01/2019 | FT96 | Sub Group | SUB Group1
28/01/2019 | 05/02/2019 | FT96 | Group | Group22
28/01/2019 | NULL | FT96 | Sub Group | SUB Group22
06/02/2019 | 11/02/2019 | FT96 | Group | Group1
12/02/2019 | NULL | FT96 | Group | Group2
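To make the validity rule concrete, here is a minimal sketch of an "as of" lookup against this layout. It assumes the table is called MyTable and that StartDate and EndDate are DATE columns rather than dd/mm/yyyy strings; the variable names are only for illustration.

```sql
-- Sketch only: MyTable and DATE-typed StartDate/EndDate are assumptions.
DECLARE @Id   VARCHAR(20) = 'FT96';
DECLARE @AsOf DATE        = '20190201';  -- 1 February 2019

-- A row applies when it started on or before the date and has either
-- not ended yet (EndDate IS NULL) or ended on or after the date.
SELECT Dimension, Value
FROM   MyTable
WHERE  ID = @Id
  AND  StartDate <= @AsOf
  AND  (EndDate IS NULL OR EndDate >= @AsOf);
```

For the sample rows this should return Name = NameYY, Status = Active, Group = Group22 and Sub Group = SUB Group22.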
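I need some help transforming this data, using SQL, into the format below. In the result set, each Dimension should become a separate column, with its corresponding value as that column's value. There should be one row for every change in any dimension's value, so that each row is a snapshot of all dimension values between two updates.
The resulting output should look like this: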
StartDate | EndDate | ID | Name | Status | Group | Sub Group
------------|------------|--------|--------|--------|---------|-------------
02/11/2018 | 19/11/2018 | FTID15 | Name1 | Active | Group1 | SUB Group1
20/11/2018 | 19/12/2018 | FTID15 | Name2 | Active | Group1 | SUB Group1
20/12/2018 | 23/01/2019 | FTID15 | Name3 | Active | Group1 | SUB Group1
24/01/2019 | 20/02/2019 | FTID15 | Name4 | Active | Group1 | SUB Group1
21/02/2019 | 27/02/2019 | FTID15 | Name5 | Active | Group1 | SUB Group1
28/02/2019 | NULL | FTID15 | Name5 | Active | Group1 | SUB Group2
02/11/2018 | 19/11/2018 | FTID12 | Namex1 | Active | Group2 | SUB Group13
20/11/2018 | NULL | FTID12 | Namex2 | Active | Group2 | SUB Group13
02/11/2018 | 27/01/2019 | FT96 | NameYY | Active | Group2 | SUB Group1
28/01/2019 | 05/02/2019 | FT96 | NameYY | Active | Group22 | SUB Group22
06/02/2019 | 11/02/2019 | FT96 | NameYY | Active | Group1 | SUB Group22
12/02/2019 | NULL | FT96 | NameYY | Active | Group2 | SUB Group22
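The dimensions are not limited to the 4 shown in the example. Their number may vary, and the transformation needs to work automatically however many dimensions there are.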
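The "one row per change" requirement can be sketched as follows: every distinct StartDate of an ID opens a snapshot row, and that row ends the day before the next one starts. This is only an illustration, assuming a table named MyTable with DATE-typed columns and assuming that a new value always starts the day after the previous one ends, as it does in the sample.

```sql
-- Sketch only: derives the snapshot periods, not the dimension columns.
WITH Boundaries AS (
    SELECT DISTINCT ID, StartDate
    FROM   MyTable
)
SELECT ID,
       StartDate,
       -- The day before the next change; NULL for the latest period.
       DATEADD(DAY, -1,
               LEAD(StartDate) OVER (PARTITION BY ID ORDER BY StartDate)) AS EndDate
FROM   Boundaries
ORDER BY ID, StartDate;
```

On the sample data this yields six periods for FTID15, two for FTID12 and four for FT96, which matches the StartDate and EndDate pairs in the desired output.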
Answer 0: (score: 0)
You can try this. PIVOT and a few window functions can solve your problem.
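-- Caveat: FIRST_VALUE returns the earliest value per ID, so a gap is filled
-- with the first value, not the most recent one (for FTID15 on 28/02/2019
-- this gives Name1, whereas the expected output shows Name5).
SELECT
    StartDate,
    EndDate,
    ID,
    ISNULL([Name],      FIRST_VALUE([Name])      OVER (PARTITION BY ID ORDER BY StartDate)) AS [Name],
    ISNULL([Status],    FIRST_VALUE([Status])    OVER (PARTITION BY ID ORDER BY StartDate)) AS [Status],
    ISNULL([Group],     FIRST_VALUE([Group])     OVER (PARTITION BY ID ORDER BY StartDate)) AS [Group],
    ISNULL([Sub Group], FIRST_VALUE([Sub Group]) OVER (PARTITION BY ID ORDER BY StartDate)) AS [Sub Group]
FROM (
    -- Give open-ended rows the EndDate of the rows that share their StartDate
    -- and ID, so that PIVOT collapses them into a single output row.
    SELECT StartDate,
           ISNULL(EndDate, MAX(EndDate) OVER (PARTITION BY StartDate, ID)) AS EndDate,
           ID, Dimension, Value
    FROM MyTable
) SRC
PIVOT (MAX(Value) FOR Dimension IN ([Name], [Status], [Group], [Sub Group])) PVT
ORDER BY ID DESC, StartDate;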
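One thing to check against the expected output: FIRST_VALUE over PARTITION BY ID ORDER BY StartDate returns the earliest value for the ID, so a gap is filled with the first value rather than the most recent one. For FTID15 on 28/02/2019 that gives Name1, while the expected output shows Name5. Below is a minimal sketch of a carry-forward variant. It assumes SQL Server 2022 or later (or Azure SQL), where LAST_VALUE accepts IGNORE NULLS, and that StartDate is a DATE column so that ORDER BY sorts chronologically. Neither assumption comes from the question.

```sql
-- Sketch: same PIVOT as above, but each gap is filled with the latest
-- non-NULL value seen so far for the ID instead of the first one.
-- Assumes SQL Server 2022+ (IGNORE NULLS) and a DATE-typed StartDate.
SELECT
    StartDate,
    EndDate,
    ID,
    LAST_VALUE([Name]) IGNORE NULLS OVER (PARTITION BY ID ORDER BY StartDate
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS [Name],
    LAST_VALUE([Status]) IGNORE NULLS OVER (PARTITION BY ID ORDER BY StartDate
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS [Status],
    LAST_VALUE([Group]) IGNORE NULLS OVER (PARTITION BY ID ORDER BY StartDate
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS [Group],
    LAST_VALUE([Sub Group]) IGNORE NULLS OVER (PARTITION BY ID ORDER BY StartDate
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS [Sub Group]
FROM (
    SELECT StartDate,
           ISNULL(EndDate, MAX(EndDate) OVER (PARTITION BY StartDate, ID)) AS EndDate,
           ID, Dimension, Value
    FROM MyTable
) SRC
PIVOT (MAX(Value) FOR Dimension IN ([Name], [Status], [Group], [Sub Group])) PVT
ORDER BY ID DESC, StartDate;
```

When the current row already has a value, LAST_VALUE with this frame returns it, so the ISNULL wrapper is no longer needed. On older versions the same effect needs a different technique, for example a correlated TOP (1) lookup per column.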
Dynamic version:
-- Build the PIVOT column list and the SELECT expressions from the
-- distinct Dimension values, so new dimensions are picked up automatically.
DECLARE @Columns NVARCHAR(MAX) = N'';
DECLARE @PivotIn NVARCHAR(MAX) = N'';

SELECT
    @PivotIn = CONCAT(@PivotIn, ', ', QUOTENAME(Dimension)),
    @Columns = CONCAT(@Columns, ', ', 'ISNULL(', QUOTENAME(Dimension), ', FIRST_VALUE(', QUOTENAME(Dimension), ') OVER (PARTITION BY ID ORDER BY StartDate)) AS ', QUOTENAME(Dimension))
FROM (SELECT DISTINCT Dimension FROM MyTable) AS X;

-- @Columns already starts with ', ', so it can follow ID directly;
-- STUFF removes the leading comma from @PivotIn.
DECLARE @SqlQuery NVARCHAR(MAX) = N'SELECT
    StartDate,
    EndDate,
    ID ' + @Columns + N'
FROM (
    SELECT StartDate,
           ISNULL(EndDate, MAX(EndDate) OVER (PARTITION BY StartDate, ID)) AS EndDate,
           ID, Dimension, Value
    FROM MyTable
) SRC
PIVOT (MAX(Value) FOR Dimension IN (' + STUFF(@PivotIn, 1, 1, '') + N')) PVT
ORDER BY ID DESC, StartDate';

EXEC sp_executesql @SqlQuery;
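The script above builds its lists by assigning to a variable once per row inside a SELECT. As a hedged alternative, the column list can be built with STRING_AGG, which makes the ordering explicit and needs no STUFF to remove a leading comma. This sketch assumes SQL Server 2017 or later and the same MyTable. It only builds and prints the list, so the generated text can be inspected before being placed into the dynamic query.

```sql
-- Sketch only: assumes SQL Server 2017+ (STRING_AGG) and a table MyTable.
DECLARE @PivotIn NVARCHAR(MAX);

SELECT @PivotIn = STRING_AGG(
           -- The cast to NVARCHAR(MAX) avoids the 8000-byte result limit.
           CAST(QUOTENAME(Dimension) AS NVARCHAR(MAX)), N', ')
           WITHIN GROUP (ORDER BY Dimension)
FROM (SELECT DISTINCT Dimension FROM MyTable) AS X;

-- Inspect the generated list before using it in the PIVOT ... IN (...) clause.
PRINT @PivotIn;
```

QUOTENAME wraps each name in brackets and escapes any closing bracket inside it, which keeps a dimension name such as Sub Group usable as a column identifier in the generated statement.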