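Convert a table with unknown structure to key/value

Time: 2019-05-08 09:58:55

Tags: sql-server tsql pivot-table key-value

The report data we receive from our analysts comes in tabular format with an arbitrary structure. All we know is that every row has a CustomerId column. Everything else is unknown to us and changes each time.

The target system that receives this data only accepts key/value format, so we have to convert the report table to key/value.

So, for example, the source report table has the following structure: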

CREATE TABLE [dbo].[SampleSourceTable](
    [CustomerId] [bigint] NULL,
    [Column1] [nchar](10) NULL,
    [Column2] [int] NULL,
    [Column3] [datetime] NULL
) ON [PRIMARY]
GO
INSERT [dbo].[SampleSourceTable] ([CustomerId], [Column1], [Column2], [Column3]) VALUES (1, N'aaa', 123, CAST(N'2019-01-01T00:00:00.000' AS DateTime))
GO
INSERT [dbo].[SampleSourceTable] ([CustomerId], [Column1], [Column2], [Column3]) VALUES (2, N'bbb', 456, CAST(N'2018-01-01T00:00:00.000' AS DateTime))
GO

We want to convert this data into the following structure:

CREATE TABLE [dbo].[SampleDestinationTable](
    [CustomerId] [bigint] NULL,
    [Attribute] [nvarchar](255) NULL,
    [Value] [nvarchar](max) NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
INSERT [dbo].[SampleDestinationTable] ([CustomerId], [Attribute], [Value]) VALUES (1, N'Column1', N'aaa')
GO
INSERT [dbo].[SampleDestinationTable] ([CustomerId], [Attribute], [Value]) VALUES (1, N'Column2', N'123')
GO
INSERT [dbo].[SampleDestinationTable] ([CustomerId], [Attribute], [Value]) VALUES (1, N'Column3', N'2019-01-01 00:00:00.000')
GO
INSERT [dbo].[SampleDestinationTable] ([CustomerId], [Attribute], [Value]) VALUES (2, N'Column1', N'bbb')
GO
INSERT [dbo].[SampleDestinationTable] ([CustomerId], [Attribute], [Value]) VALUES (2, N'Column2', N'456')
GO
INSERT [dbo].[SampleDestinationTable] ([CustomerId], [Attribute], [Value]) VALUES (2, N'Column3', N'2018-01-01 00:00:00.000')
GO

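The challenge here, however, is that the source report table has no fixed structure.

At first, I thought about using a cursor to iterate over every row, and then a nested cursor to iterate over all columns in that row. But apparently there is no way of processing a row with an unknown structure using cursors. So now I am wondering whether PIVOT / UNPIVOT could be used. Then again, I think they need the column list too.

I am running SQL Server 2017.

How can I convert data whose structure is unknown?

2 answers:

Answer 0 (score: 2)

One possible approach is to use the information in INFORMATION_SCHEMA.COLUMNS to generate a dynamic statement: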

-- Declarations
DECLARE @stm nvarchar(max)

-- Dynamic part 
SELECT 
    @stm = STUFF((
        SELECT CONCAT(
            N' UNION ALL SELECT CustomerID, ''', 
            [COLUMN_NAME],
            N''' AS [Attribute], CONVERT(nvarchar(max), ',
            QUOTENAME([COLUMN_NAME]),
            CASE 
                WHEN DATA_TYPE = 'datetime' THEN N', 121'
                -- Add additional conversion rules for other data types
                ELSE N''
            END,
            N') AS [Value]', 
            N' FROM [SampleSourceTable]'
        )
        FROM INFORMATION_SCHEMA.COLUMNS
        WHERE (TABLE_NAME = 'SampleSourceTable') AND (COLUMN_NAME <> 'CustomerId')
    FOR XML PATH('')
    ), 1, 11, N'')

-- Whole statement and execution
-- (leading space keeps ORDER BY separated from the last table name)
SET @stm = @stm + N' ORDER BY CustomerID'
PRINT @stm 
EXEC (@stm)
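
To make the dynamic part easier to follow, here is roughly what PRINT @stm should show for the sample table from the question. This is a hand-expanded sketch, reformatted over several lines for readability; the real string is a single line, and the order of the branches depends on the order in which INFORMATION_SCHEMA.COLUMNS happens to return the rows.

```sql
-- Sketch of the generated statement for dbo.SampleSourceTable
-- (one SELECT per non-key column, glued together with UNION ALL).
SELECT CustomerID, 'Column1' AS [Attribute], CONVERT(nvarchar(max), [Column1]) AS [Value] FROM [SampleSourceTable]
UNION ALL
SELECT CustomerID, 'Column2' AS [Attribute], CONVERT(nvarchar(max), [Column2]) AS [Value] FROM [SampleSourceTable]
UNION ALL
-- datetime columns get style 121 (yyyy-mm-dd hh:mi:ss.mmm)
SELECT CustomerID, 'Column3' AS [Attribute], CONVERT(nvarchar(max), [Column3], 121) AS [Value] FROM [SampleSourceTable]
ORDER BY CustomerID
```

The STUFF(..., 1, 11, N'') call in the answer only strips the first ' UNION ALL ' separator (11 characters) from the concatenated string.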

Output:

CustomerID  Attribute   Value
1           Column1     aaa       
1           Column2     123
1           Column3     2019-01-01 00:00:00.000
2           Column3     2018-01-01 00:00:00.000
2           Column2     456
2           Column1     bbb       
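
Since the question mentions SQL Server 2017, the same idea can probably be written with STRING_AGG instead of FOR XML PATH. The following is only a sketch under a few assumptions: the @table variable and the dbo schema filter are illustrative and not part of the original answer, and the statement is run through sp_executesql rather than EXEC.

```sql
-- Hypothetical variant of the answer above using STRING_AGG (SQL Server 2017+).
DECLARE @table sysname = N'SampleSourceTable';  -- assumed table name, adjust as needed
DECLARE @stm nvarchar(max);

SELECT @stm = STRING_AGG(
        CONVERT(nvarchar(max),
            N'SELECT CustomerId, N'
            + QUOTENAME(COLUMN_NAME, '''')      -- column name as an escaped string literal
            + N' AS [Attribute], CONVERT(nvarchar(max), '
            + QUOTENAME(COLUMN_NAME)            -- column name as a bracketed identifier
            + CASE WHEN DATA_TYPE = 'datetime' THEN N', 121' ELSE N'' END
            + N') AS [Value] FROM '
            + QUOTENAME(TABLE_SCHEMA) + N'.' + QUOTENAME(TABLE_NAME)),
        N' UNION ALL ')
    WITHIN GROUP (ORDER BY ORDINAL_POSITION)    -- keep the table's column order
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = N'dbo'                     -- assumed schema
  AND TABLE_NAME = @table
  AND COLUMN_NAME <> N'CustomerId';

-- Sorting by Attribute as well makes the row order within a customer deterministic.
SET @stm += N' ORDER BY CustomerId, [Attribute]';

PRINT @stm;
EXEC sys.sp_executesql @stm;
```

Compared with FOR XML PATH(''), this avoids XML entity encoding of special characters in column names, and the separator does not need to be trimmed with STUFF.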

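Answer 1 (score: 0)

Ah, you opened a second question, and I had just answered your first one...

So I will use this place to provide the same technique as in my other answer, which does not need any dynamically created SQL. Try it out: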

DECLARE @xml XML =(SELECT TOP 10 o.object_id,o.* FROM sys.objects o FOR XML RAW, ELEMENTS XSINIL);

SELECT r.value('*[1]/text()[1]','nvarchar(max)') AS RowID
        ,c.value('local-name(.)','nvarchar(max)') AS ColumnKey
        ,c.value('text()[1]','nvarchar(max)') AS ColumnValue
FROM @xml.nodes('/row') A(r)
CROSS APPLY A.r.nodes('*[position()>1]') B(c);

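The first column of the set is returned as RowID. If that is not the right column, you can force the one you need into first position, the same way I did above with o.object_id. All columns of the result are returned as EAV (entity/attribute/value) rows.

Partial result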

+-------+---------------------+-------------------------+
| RowID | ColumnKey           | ColumnValue             |
+-------+---------------------+-------------------------+
| 3     | name                | sysrscols               |
+-------+---------------------+-------------------------+
| 3     | object_id           | 3                       |
+-------+---------------------+-------------------------+
| 3     | principal_id        | NULL                    |
+-------+---------------------+-------------------------+
| 3     | schema_id           | 4                       |
+-------+---------------------+-------------------------+
| 3     | parent_object_id    | 0                       |
+-------+---------------------+-------------------------+
| 3     | type                | S                       |
+-------+---------------------+-------------------------+
| 3     | type_desc           | SYSTEM_TABLE            |
+-------+---------------------+-------------------------+
| 3     | create_date         | 2017-08-22T19:38:02.860 |
+-------+---------------------+-------------------------+
| 3     | modify_date         | 2017-08-22T19:38:02.867 |
+-------+---------------------+-------------------------+
| 3     | is_ms_shipped       | 1                       |
+-------+---------------------+-------------------------+
| 3     | is_published        | 0                       |
+-------+---------------------+-------------------------+
| 3     | is_schema_published | 0                       |
+-------+---------------------+-------------------------+
| 5     | name                | sysrowsets              |
+-------+---------------------+-------------------------+
| ... more rows ...
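
Applied to the table from the question, the XML approach might look like the sketch below. It assumes the dbo.SampleSourceTable definition shown in the question and picks the key by element name instead of by position, so CustomerId does not have to be the first column and is not repeated as an attribute row.

```sql
-- Sketch: unpivot dbo.SampleSourceTable via XML, without dynamic SQL.
DECLARE @xml XML = (SELECT * FROM dbo.SampleSourceTable FOR XML RAW, ELEMENTS XSINIL);

SELECT r.value('(CustomerId/text())[1]', 'bigint')  AS CustomerId
      ,c.value('local-name(.)', 'nvarchar(255)')    AS [Attribute]
      ,c.value('text()[1]', 'nvarchar(max)')        AS [Value]
FROM @xml.nodes('/row') AS A(r)
-- every child element except the key column becomes one key/value row
CROSS APPLY A.r.nodes('*[local-name(.) != "CustomerId"]') AS B(c);
```

A few caveats: XML element names are case-sensitive, so "CustomerId" has to match the column name exactly. Values come back in their XML representation (datetime values, for instance, appear in ISO 8601 form with a T separator rather than in style 121), and column names containing characters that are not valid in XML names are encoded in the element name.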