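Converting data from multiple records into a single record in SQL Server

Time: 2012-03-17 23:07:05

Tags: sql sql-server-2008 pivot

I need to convert multiple similar records into a single record. There can be up to 10 rows to combine. Each group of rows to be combined shares the same ID, and the values in the rows are unrelated (they are actually GUIDs). The data looks like this:

Table A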

ID    C1   C2    C3  
ID1   x    x     x  
ID1   y    y     y  
ID2   y    y     y  
ID2   x    x     x  
ID2   y    y     y  
ID2   y    y     y  
ID3   x    x     x  
ID3   y    y     y  
ID3   y    y     y  

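I need to convert it into this structure, with only one record per ID. Depending on how many records share the same ID (about 10), there can be N columns.

Table B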

ID     C1     C2     C3     C1A     C2A     C3A     C1B     C2B     C3B
ID1    x      x      x      y       y       y       null    null    null
ID2    y      y      y      x       x       x       y       y       y
ID3    x      x      x      y       y       y       y       y       y

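I cannot modify Table B at all; I can only merge or insert into it. I'm using SQL Server 2008 R2, and Table A holds roughly one million records. Any help is appreciated.

Update: added the real table definitions.

Here is the TableA creation script: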

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[IntervalPivotTable](
    [UID] [uniqueidentifier] NOT NULL,
    [ServiceHash] [int] NULL,
    [IntervalID] [nvarchar](50) NULL,
    [IntervalTypeID] [nvarchar](50) NULL,
    [IntervalGroupID] [nvarchar](50) NULL,
    [DrivingConditionID] [nvarchar](50) NULL,
 CONSTRAINT [PK_IntervalPivotTable] PRIMARY KEY CLUSTERED 
(
    [UID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

GO

ALTER TABLE [dbo].[IntervalPivotTable] ADD CONSTRAINT [DF_IntervalPivotTable_UID] DEFAULT (newid()) FOR [UID]
GO

Here is the TableB creation script:

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[PivotedIntervals](
    [ServiceHash] [int] NULL,
    [IntervalID_0] [nvarchar](50) NULL,
    [IntervalTypeID_0] [nvarchar](50) NULL,
    [IntervalGroupID_0] [nvarchar](50) NULL,
    [DrivingConditionID_0] [nvarchar](50) NULL,
    [IntervalID_1] [nvarchar](50) NULL,
    [IntervalTypeID_1] [nvarchar](50) NULL,
    [IntervalGroupID_1] [nvarchar](50) NULL,
    [DrivingConditionID_1] [nvarchar](50) NULL,
    [IntervalID_2] [nvarchar](50) NULL,
    [IntervalTypeID_2] [nvarchar](50) NULL,
    [IntervalGroupID_2] [nvarchar](50) NULL,
    [DrivingConditionID_2] [nvarchar](50) NULL,
    [IntervalID_3] [nvarchar](50) NULL,
    [IntervalTypeID_3] [nvarchar](50) NULL,
    [IntervalGroupID_3] [nvarchar](50) NULL,
    [DrivingConditionID_3] [nvarchar](50) NULL,
    [IntervalID_4] [nvarchar](50) NULL,
    [IntervalTypeID_4] [nvarchar](50) NULL,
    [IntervalGroupID_4] [nvarchar](50) NULL,
    [DrivingConditionID_4] [nvarchar](50) NULL,
    [IntervalID_5] [nvarchar](50) NULL,
    [IntervalTypeID_5] [nvarchar](50) NULL,
    [IntervalGroupID_5] [nvarchar](50) NULL,
    [DrivingConditionID_5] [nvarchar](50) NULL,
    [IntervalID_6] [nvarchar](50) NULL,
    [IntervalTypeID_6] [nvarchar](50) NULL,
    [IntervalGroupID_6] [nvarchar](50) NULL,
    [DrivingConditionID_6] [nvarchar](50) NULL,
    [IntervalID_7] [nvarchar](50) NULL,
    [IntervalTypeID_7] [nvarchar](50) NULL,
    [IntervalGroupID_7] [nvarchar](50) NULL,
    [DrivingConditionID_7] [nvarchar](50) NULL,
    [IntervalID_8] [nvarchar](50) NULL,
    [IntervalTypeID_8] [nvarchar](50) NULL,
    [IntervalGroupID_8] [nvarchar](50) NULL,
    [DrivingConditionID_8] [nvarchar](50) NULL,
    [IntervalID_9] [nvarchar](50) NULL,
    [IntervalTypeID_9] [nvarchar](50) NULL,
    [IntervalGroupID_9] [nvarchar](50) NULL,
    [DrivingConditionID_9] [nvarchar](50) NULL,
    [IntervalID_10] [nvarchar](50) NULL,
    [IntervalTypeID_10] [nvarchar](50) NULL,
    [IntervalGroupID_10] [nvarchar](50) NULL,
    [DrivingConditionID_10] [nvarchar](50) NULL 
) ON [PRIMARY]

GO

1 answer:

Answer 0 (score: 3)

You might want to try an unpivot/pivot combination. Unpivot turns TableA into rows with only three columns (id, ColumnID, and guid), for example

ID1, C1_0, x
ID1, C2_0, x
ID1, C3_0, x
ID1, C1_1, y
ID1, C2_1, y
ID1, C3_1, y
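To look at that intermediate shape in isolation, here is a minimal sketch of the unpivot step on its own. The table variable @TableA, its identity column PK, and the RowNo alias are assumptions made up for illustration; they are not part of the question's schema.

```sql
-- Hypothetical stand-in for TableA; PK is an assumed ordering column.
declare @TableA table
(
  PK int identity primary key,
  ID varchar(10),
  C1 varchar(10),
  C2 varchar(10),
  C3 varchar(10)
);

insert into @TableA (ID, C1, C2, C3)
values ('ID1', 'x', 'x', 'x'),
       ('ID1', 'y', 'y', 'y');

-- One output row per (source row, source column): ID, ColumnID, guid.
select ID,
       code + '_' + convert(varchar(10), RowNo) as ColumnID,
       guid
from
(
  -- Number the rows of each ID from 0 before unpivoting.
  select ID, C1, C2, C3,
         row_number() over (partition by ID order by PK) - 1 as RowNo
  from @TableA
) a
unpivot
(
  guid for code in (C1, C2, C3)
) as u
order by RowNo, code;
```

Note that unpivot requires all listed columns to share the same type and length, which is why the three value columns are declared identically here.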

Add a unique number for each id/column combination (the row_number() over (...) part), and do this for every row in TableA. Pivot then turns them into a single row per id:

select *
from
(
  -- Build the target column name (e.g. c1_0) from the source column name and the row number
  select id, 
         code 
         + '_' 
         + convert(varchar(10), ColumnID) ColumnID,
         guid
  from 
  (
    -- Number the rows of each ID starting from 0, before unpivoting
    select TableA.*,
           row_number() over (partition by ID
                              order by TableA_PK) - 1 ColumnID
      from TableA
  ) a
  unpivot
  (
    guid for code in (c1, c2, c3)
  ) as u
) UnpivotedTable
pivot
(
  min(guid)
  for columnid in ([c1_0], [c2_0], [c3_0], [c1_1], [c2_1], [c3_1])
) PivotedTable

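Don't be alarmed by the presence of min(guid). It is there only because pivot insists on an aggregate function. Since each ID/ColumnID combination has exactly one guid, no values are lost. To verify, replace min(guid) with count(guid) and check for values greater than 1.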
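The count(guid) check could look roughly like the sketch below. It runs against a made-up table variable (@TableA with an assumed TableA_PK identity column and a RowNo alias) rather than the real table, so treat the names as placeholders.

```sql
-- Hypothetical stand-in for TableA with an assumed key column.
declare @TableA table
(
  TableA_PK int identity primary key,
  ID varchar(10),
  C1 varchar(10),
  C2 varchar(10),
  C3 varchar(10)
);

insert into @TableA (ID, C1, C2, C3)
values ('ID1', 'x', 'x', 'x'),
       ('ID1', 'y', 'y', 'y'),
       ('ID2', 'y', 'y', 'y');

-- Same pivot as above, but counting values instead of picking one.
select *
from
(
  select ID,
         code + '_' + convert(varchar(10), RowNo) as ColumnID,
         guid
  from
  (
    select ID, C1, C2, C3,
           row_number() over (partition by ID order by TableA_PK) - 1 as RowNo
    from @TableA
  ) a
  unpivot
  (
    guid for code in (C1, C2, C3)
  ) as u
) UnpivotedTable
pivot
(
  count(guid)
  for ColumnID in ([C1_0], [C2_0], [C3_0], [C1_1], [C2_1], [C3_1])
) PivotedTable
-- Any row returned here means several values landed in one cell.
where [C1_0] > 1 or [C2_0] > 1 or [C3_0] > 1
   or [C1_1] > 1 or [C2_1] > 1 or [C3_1] > 1;
```

An empty result would suggest that min(guid) never has to choose between competing values.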

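Update: To avoid mixing different rows of the same ID, I had to change the order by clause of row_number() to TableA's primary key. If the table has no PK, the input should be transformed so that it includes a row_number() as a surrogate primary key: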

(
       select TableA.*, row_number() over (order by ID) TableA_PK
         from TableA
) TableAWithAPrimaryKey

and use the derived table in place of the original table.
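As a rough illustration of plugging that derived table in, the sketch below numbers the rows of a key-less table twice: once for the surrogate key and once per ID. The table variable @TableA and the aliases t and k are assumptions for the example only.

```sql
-- Hypothetical key-less stand-in for TableA.
declare @TableA table
(
  ID varchar(10),
  C1 varchar(10),
  C2 varchar(10),
  C3 varchar(10)
);

insert into @TableA (ID, C1, C2, C3)
values ('ID1', 'x', 'x', 'x'),
       ('ID1', 'y', 'y', 'y'),
       ('ID2', 'y', 'y', 'y');

-- The outer row_number() restarts at 0 for every ID and orders by the surrogate key.
select k.*,
       row_number() over (partition by ID order by TableA_PK) - 1 as ColumnID
from
(
  -- Surrogate key: a running number over the whole table.
  select t.*, row_number() over (order by ID) as TableA_PK
  from @TableA t
) k;
```

Because the surrogate key is ordered by ID alone, the relative order of rows within one ID is arbitrary; that is acceptable here only because the question states the row values are unrelated.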

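Update 2:

My mistake was putting row_number() in the unpivot select. Obviously, the row number per ID has to be obtained before the transformation. Here is a pivot using the original tables: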

select *
from
(
  select ServiceHash, 
         code 
         + '_' 
         + convert(varchar(10), ColumnID) ColumnID,
         guid
  from 
  (
    select IntervalPivotTable.*,
           row_number() over (partition by ServiceHash 
                            order by UID) - 1 ColumnID
      from IntervalPivotTable
  ) a
  unpivot
  (
    guid for code in 
    (
      IntervalID,
      IntervalTypeID,
      IntervalGroupID,
      DrivingConditionID
    )
  ) as u
) UnpivotedTable
pivot
(
  min(guid)
  for columnid in 
  (
    IntervalID_0,
    IntervalTypeID_0,
    IntervalGroupID_0,
    DrivingConditionID_0,
    IntervalID_1,
    IntervalTypeID_1,
    IntervalGroupID_1,
    DrivingConditionID_1,
    IntervalID_2,
    IntervalTypeID_2,
    IntervalGroupID_2,
    DrivingConditionID_2
 -- Continue list of pivoted columns up to _10 here
  )
) PivotedTable
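Since the question says the target table can only be inserted or merged into, the pivot could feed an INSERT ... SELECT. This is a minimal sketch under that assumption, not a tested load script: it lists only slots _0 and _1, so the remaining columns stay NULL until the lists are continued up to _10, and rows numbered beyond the listed slots are silently ignored by pivot.

```sql
insert into dbo.PivotedIntervals
(
  ServiceHash,
  IntervalID_0, IntervalTypeID_0, IntervalGroupID_0, DrivingConditionID_0,
  IntervalID_1, IntervalTypeID_1, IntervalGroupID_1, DrivingConditionID_1
  -- Continue the column list up to _10 here
)
select ServiceHash,
       IntervalID_0, IntervalTypeID_0, IntervalGroupID_0, DrivingConditionID_0,
       IntervalID_1, IntervalTypeID_1, IntervalGroupID_1, DrivingConditionID_1
       -- Continue the select list up to _10 here
from
(
  select ServiceHash,
         code + '_' + convert(varchar(10), ColumnID) ColumnID,
         guid
  from
  (
    -- Number the rows of each ServiceHash from 0, before unpivoting.
    select IntervalPivotTable.*,
           row_number() over (partition by ServiceHash
                              order by UID) - 1 ColumnID
      from dbo.IntervalPivotTable
  ) a
  unpivot
  (
    guid for code in
    (
      IntervalID,
      IntervalTypeID,
      IntervalGroupID,
      DrivingConditionID
    )
  ) as u
) UnpivotedTable
pivot
(
  min(guid)
  for ColumnID in
  (
    IntervalID_0, IntervalTypeID_0, IntervalGroupID_0, DrivingConditionID_0,
    IntervalID_1, IntervalTypeID_1, IntervalGroupID_1, DrivingConditionID_1
    -- Continue the list of pivoted columns up to _10 here
  )
) PivotedTable;
```

Naming the columns explicitly on both sides, rather than relying on select *, keeps the insert independent of the order in which pivot emits its columns.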