SQL查询:分层合并

时间:2010-08-21 16:50:21

标签: sql-server-2005 hierarchy coalesce

我有一个定义层次结构的表:

Create Table [example] (
    id          Integer   Not Null Primary Key,
    parentID    Integer       Null,
    largeData1  nVarChar(max) Null,
    largeData2  nVarChar(max) Null);
    -- largeData3...n also exist

Insert Into [example] (id, parentID, largeData1, largeData2)
Select 1, null, 'blah blah blah', null          Union
Select 2,    1, null,             null          Union
Select 3,    1, 'foo bar foobar', null          Union
Select 4,    3, null,             'lorem ipsum' Union
Select 5,    4, null,             null;

此数据的层次结构图:

Hierarchy diagram

我想编写一个查询,它将返回任何给定[id]值的单行。该行应包含该行的[id]和[parentID]信息。它还应包含[largeData1 ... n]字段。但是,如果largeData字段为null,则它应该遍历层次结构,直到遇到该字段的非null值。简而言之,它应该像coalesce函数一样运行,除了行的层次结构而不是一组列。

示例:

[ID] = 1:

id:          1
parentID:    null
largeData1:  blah blah blah
largeData2:  null

[ID] = 2

id:          1
parentID:    1
largeData1:  blah blah blah
largeData2:  null

[ID] = 3

id:          3
parentID:    1
largeData1:  foo bar foobar
largeData2:  null

其中[id] = 4

id:          4
parentID:    3
largeData1:  foo bar foobar
largeData2:  lorem ipsum

[ID] = 5

id:          5
parentID:    4
largeData1:  foo bar foobar
largeData2:  lorem ipsum

到目前为止,我有这个:

Declare @id Integer; Set @id = 5;

With heirarchy
    (id, parentID, largeData1, largeData2, [level])
As (
    Select id, parentID, largeData1,
           largeData2, 1 As [level]
    From example
    Where id = @id

    Union All

    Select parent.id, parent.parentID,
           parent.largeData1,
           parent.largeData2,
           child.[level] + 1 As [level]
    From example As parent
    Inner Join heirarchy As child
        On parent.id = child.parentID)

Select id, parentID,
   (Select top 1 largeData1
    From heirarchy
    Where largeData1 Is Not Null
    Order By [level] Asc) As largeData1,

   (Select top 1 largeData2
    From heirarchy
    Where largeData2 Is Not Null
    Order By [level] Asc) As largeData2

From example
Where [id] = @id;

这会返回我要查找的结果。但是,根据查询计划,它会为我撤回的每个largeData字段单独传递层次结构。

如何提高效率?

这显然是更复杂问题的简化版本。最终查询将以XML格式返回数据,因此涉及FOR XML子句的任何解决方案都非常好。

我可以为此创建一个CLR聚合函数,如果这样做会有所帮助。我还没有探索过这条路线。

1 个答案:

答案 0 :(得分:6)

我想出了这个:

DECLARE @Id  int

SET @Id = 5


;WITH cte (Id, ParentId, SaveParentId, LargeData1, LargeData2)
 as (--  The "anchor", your target Id
     select
        ex.Id
       ,ex.ParentId
       ,ex.ParentId  SaveParentId  --  Not changed throughout the CTE
       ,ex.LargeData1
       ,ex.LargeData2
      from Example ex
      where ex.Id = @Id
     union all select
                 cte.Id
                ,ex.ParentId
                ,cte.SaveParentId  --  Not changed throughout the CTE
                 --  These next are only "reset" if they are null and a not-null
                 --  value was found at this level 
                ,isnull(ex.LargeData1, cte.LargeData2)  
                ,isnull(ex.LargeData2, cte.LargeData2)
      from Example ex
       inner join cte
        on cte.ParentId = ex.Id)
 select
   Id
  ,SaveParentId     ParentId
  ,max(LargeData1)  LargeData1
  ,max(LargeData2)  LargeData2
 from cte
 group by Id, SaveParentId

基本上,从目标节点开始,然后向上走树,在找到它们时将空列替换为非空值。

(抱歉,我周末不做XML。)