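Transpose/flatten an XML structure into columns using SQL

Asked: 2013-03-01 16:39:44

Tags: sql-server xml tsql pivot xquery-sql

I am using SQL Server (2008/2012). I know there are answers to similar questions all over the place, but I can't seem to find a suitable example or pointer for my case.

I have an XML column in a SQL Server table containing the following data: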

<Items>
 <Item>
  <FormItem>
    <Text>FirstName</Text>
    <Value>My First Name</Value>
  </FormItem>
  <FormItem>
    <Text>LastName</Text>
    <Value>My Last Name</Value>
  </FormItem>
  <FormItem>
    <Text>Age</Text>
    <Value>39</Value>
  </FormItem>
 </Item>
 <Item>
  <FormItem>
    <Text>FirstName</Text>
    <Value>My First Name 2</Value>
  </FormItem>
  <FormItem>
    <Text>LastName</Text>
    <Value>My Last Name 2</Value>
  </FormItem>
  <FormItem>
    <Text>Age</Text>
    <Value>40</Value>
  </FormItem>
 </Item>
</Items>

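So although the structure of <FormItem> is always the same, I can have multiple form items (usually no more than 20-30).

I am essentially trying to return a query from SQL in the format below, i.e. dynamic columns based on /FormItem/Text: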

FirstName         LastName         Age    ---> More columns as new `<FormItem>` are returned
My First Name     My Last Name     39          Whatever value etc..
My First Name 2   My Last Name 2   40          

So, at the moment I have the following:

select 
    Tab.Col.value('Text[1]','nvarchar(100)') as Question,
    Tab.Col.value('Value[1]','nvarchar(100)') as Answer
from
    @Questions.nodes('/Items/Item/FormItem') Tab(Col)

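Of course, this does not transpose my XML rows into columns, and the fields are obviously fixed anyway. I have been trying various "dynamic SQL" approaches where the SQL performs a distinct selection of (in my case) the <Text> nodes and then uses some sort of pivot, but I can't seem to find the magic combination that returns the results I need as a dynamic set of columns for each row (each <Item> within the <Items> collection).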

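I'm sure it can be done, having seen so many very similar examples, but once again the solution eludes me!

Any help gratefully received!!

3 Answers:

Answer 0 (score: 7)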

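Shredding XML is fairly expensive, so instead of shredding once to build the dynamic query and a second time to get the data, you can create a temp table with a Name-Value list and then use that as the source for the dynamic pivot query.
dense_rank is there to create the ID to pivot around. To build the column list in the dynamic query, it uses the for xml path('') trick.

This solution requires that your table has a primary key (ID). If you have the XML in a variable, it can be simplified somewhat.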

select dense_rank() over(order by ID, I.N) as ID,
       F.N.value('(Text/text())[1]', 'varchar(max)') as Name,
       F.N.value('(Value/text())[1]', 'varchar(max)') as Value
into #T
from YourTable as T
  cross apply T.XMLCol.nodes('/Items/Item') as I(N)
  cross apply I.N.nodes('FormItem') as F(N)

declare @SQL nvarchar(max)
declare @Col nvarchar(max)

select @Col = 
  (
  select distinct ','+quotename(Name)
  from #T
  for xml path(''), type
  ).value('substring(text()[1], 2)', 'nvarchar(max)')

set @SQL = 'select '+@Col+'
            from #T
            pivot (max(Value) for Name in ('+@Col+')) as P'

exec (@SQL)

drop table #T

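For the case mentioned above where the XML is held in a variable rather than a table column, a simplified version might look like the sketch below. It assumes a variable named @Questions (as in the question) loaded with the sample data, and uses nvarchar(100) for names and values as the question's own query does. Ordering dense_rank by the node column I.N relies on behaviour that is widely used but, as far as I know, not documented, so treat it as an assumption.

```sql
-- Assumption: the XML is in a variable named @Questions, as in the question.
declare @Questions xml = N'
<Items>
  <Item>
    <FormItem><Text>FirstName</Text><Value>My First Name</Value></FormItem>
    <FormItem><Text>LastName</Text><Value>My Last Name</Value></FormItem>
    <FormItem><Text>Age</Text><Value>39</Value></FormItem>
  </Item>
  <Item>
    <FormItem><Text>FirstName</Text><Value>My First Name 2</Value></FormItem>
    <FormItem><Text>LastName</Text><Value>My Last Name 2</Value></FormItem>
    <FormItem><Text>Age</Text><Value>40</Value></FormItem>
  </Item>
</Items>';

-- Shred the XML once into a Name-Value list.
-- dense_rank gives every <Item> its own ID to pivot around; no table key is needed.
select dense_rank() over(order by I.N) as ID,
       F.N.value('(Text/text())[1]', 'nvarchar(100)') as Name,
       F.N.value('(Value/text())[1]', 'nvarchar(100)') as Value
into #T
from @Questions.nodes('/Items/Item') as I(N)
  cross apply I.N.nodes('FormItem') as F(N);

declare @SQL nvarchar(max);
declare @Col nvarchar(max);

-- Build the quoted, comma-separated column list from the distinct names.
select @Col =
  (
  select distinct ',' + quotename(Name)
  from #T
  for xml path(''), type
  ).value('substring(text()[1], 2)', 'nvarchar(max)');

-- ID is not selected, but pivot still groups by it, giving one row per <Item>.
set @SQL = N'select ' + @Col + N'
            from #T
            pivot (max(Value) for Name in (' + @Col + N')) as P';

exec (@SQL);

drop table #T;
```

With the sample data this should return one row per <Item>, with one column per distinct <Text> value. The order of the generated columns is not guaranteed.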

Answer 1 (score: 2)

select Tab.Col.value('(FormItem[Text = "FirstName"]/Value)[1]', 'varchar(32)') as FirstName, 
        Tab.Col.value('(FormItem[Text = "LastName"]/Value)[1]', 'varchar(32)') as LastName, 
        Tab.Col.value('(FormItem[Text = "Age"]/Value)[1]', 'int') as Age
from @Questions.nodes('/Items/Item') Tab(Col)

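Answer 2 (score: 2)

I wanted to add my "own answer" really just for completeness, in case it helps others. It is definitely based on the help from @Mikael above, so all credit goes to @Mikael.

Essentially I ended up with the following procedure. I needed to select and filter some data, bring in some joined data, and allow optional filtering on some of the input parameters. The next section then creates a temp table of the relational data and the required XML nodes via cross apply. The final step pivots the results, dynamically creating columns from the selected XML nodes.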

CREATE PROCEDURE [dbo].[usp_RPT_ExtractFlattenentries]
    @CompanyID          int,
    @MainSelector       nvarchar(50) = null,
    @SecondarySelector      nvarchar(255) = null,
    @DateFrom           datetime = '01-jan-2012',
    @DateTo             datetime = '31-dec-2100',
    @SysReference       nvarchar(20) = null
AS
BEGIN
    SET NOCOUNT ON;

    --  Create the table var to hold the XML form data from the entries
    declare @FeedbackXml table (
        ID int identity primary key,
        XMLCol xml,
        CompanyName nvarchar(20),
        SysReference nvarchar(20),
        RecordDate datetime,
        EntryName  nvarchar(255),
        MainSelector nvarchar(50)
    )

    --  STEP 1: Get the raw submission data based on the params passed in
    --  *Note: The double casting is necessary as the "form" field is nvarchar (not varchar) and we need xml in UTF-8 format
    begin
        insert into @FeedbackXml
            (XMLCol, CompanyName, SysReference, RecordDate, EntryName, MainSelector)
        select cast(cast(e.form as nvarchar(max)) as xml), c.name, e.SysReference, e.RecordDate, e.name, e.wizard
        from 
            entries e
        left join
            companies c on e.companies = c.ID
        where 
            (@CompanyID = -1 or @CompanyID = e.companies)
        and
            (@MainSelector is null or @MainSelector = e.wizard)
        and
            (@SecondarySelector is null or @SecondarySelector = e.name)
        and
            (@SysReference is null or @SysReference = e.SysReference)
        and
            (e.RecordDate >= @DateFrom and e.RecordDate <= @DateTo)
    end

    --  STEP 2: Flatten the required XML structure to provide a base for the pivot, and include other fields we wish to output
    select dense_rank() over(order by ID) as ID,
            T.RecordDate, T.CompanyName, T.SysReference, T.EntryName, T.MainSelector,
            F.N.value('(FieldNameNode/text())[1]', 'nvarchar(max)') as FieldName,
            F.N.value('(FieldNameValue/text())[1]', 'nvarchar(max)') as FieldValue
    into #TempData
    from @FeedbackXml as T
        cross apply T.XMLCol.nodes('/root/companies') as I(N) -- XPath to the desired node start point (no trailing slash)
        cross apply I.N.nodes('company') as F(N) -- The actual node collection that forms the "field name" and "field value" data

    --  STEP 3: Pivot the #TempData table creating a dynamic column structure based on the selected XML nodes in step 2
    declare @SQL nvarchar(max)
    declare @Col nvarchar(max)

    select @Col = 
      (
      select distinct ','+quotename(FieldName)
      from #TempData
      for xml path(''), type
      ).value('substring(text()[1], 2)', 'nvarchar(max)')

    set @SQL = 'select CompanyName, SysReference, EntryName, MainSelector, RecordDate, '+@Col+'
                from #TempData
                pivot (max(FieldValue) for FieldName in ('+@Col+')) as P'

    exec (@SQL)
    drop table #TempData

END
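
As a usage sketch, a call to the procedure might look like the following. The parameter values are purely illustrative. Passing -1 for @CompanyID selects all companies, as implied by the WHERE clause above, and the omitted parameters fall back to their defaults.

```sql
-- Hypothetical call: all companies (-1), restricted to an illustrative date range.
-- @MainSelector, @SecondarySelector and @SysReference default to null (no filter).
exec dbo.usp_RPT_ExtractFlattenentries
    @CompanyID = -1,
    @DateFrom  = '20130101',
    @DateTo    = '20131231';
```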

Again, I have really only added this answer to give a complete picture from my side, in the hope that it helps others.