Question

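I have a data table with 24 columns (12 groups of 2 columns). The data is scattered all over the place, and I need to compact it into a clean dimension table. This is what it looks like: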

Business Key | Group1 | Group2 | Group3 | Group8 | Group11 | Group12
00001        | abc123 |        | efg456 | ght456 |         |

I need it to look like:

Business Key | Group1 | Group2 | Group3 | Group8 | Group11 | Group12
00001        | abc123 | efg456 | ght456 |        |         |

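I have tried coalescing the data, but as soon as there is a blank, it repeats values in the later columns. I have a feeling I need to put the data in a temp table and loop over it with joins, but I can't seem to get it right.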

I'm fairly sure I could do this with 12 joins. But there has to be a more elegant solution, since I have more than 80 million records to go through.

Clarification: I did the following

Group1 = Coalesce(group1,group2,group3,...,group11,group12)
Group2 = Coalesce(group2,group3,...,group11,group12)
Group3 = Coalesce(group3,...,group11,group12)

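...and so on. Coalesce works for the first gap, but it keeps shifting everything, because it doesn't know that a column's data has already been moved earlier.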
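To make the failure concrete, here is a minimal sketch using a single hypothetical four-column row (the column count and the varchar(10) type are assumptions for illustration; the real table has more columns). It applies the same cascading Coalesce pattern and shows the value after the gap being emitted twice:

```sql
-- One hypothetical row with a gap in the second column.
-- The NULL is cast explicitly so the column is typed as varchar.
select coalesce(group1, group2, group3, group4) as group1,  -- abc123
       coalesce(group2, group3, group4)         as group2,  -- efg456 (pulled forward from group3)
       coalesce(group3, group4)                 as group3,  -- efg456 again: the duplicate
       group4                                   as group4   -- ght456
from (values ('abc123', cast(null as varchar(10)), 'efg456', 'ght456')
     ) t(group1, group2, group3, group4);
```

The wanted result for this row is abc123, efg456, ght456, NULL: each expression would need to know how many values the expressions to its left have already consumed.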

Answer 1

The logic with coalesce() is a bit complicated:

select coalesce(group1, group2, group3, . . . ) as group1,
       (case when group1 is not null then coalesce(group2, group3, group4, . . .)
             when group2 is not null then coalesce(group3, group4, group5, . . .)
        . . .
        end) as group2,
       . . .

As you can see, this gets really complicated. I wonder whether the following would perform reasonably:

select dt.businesskey, p.*
from datatable dt cross apply
     (select max(case when seqnum = 1 then grp end) as grp1,
             max(case when seqnum = 2 then grp end) as grp2,
             . . .
      from (select grp, row_number() over (order by num) as seqnum
            from (values(dt.group1, 1),
                        (dt.group2, 2),
                        . . .
                 ) v(grp, num)
            where grp is not null
           ) p
     ) p;

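This looks complicated. But SQL Server does a pretty good job of optimizing apply for within-row transformations. It is worth a try to see whether this works for you.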
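For reference, a fully spelled-out version of the same idea might look like the sketch below. It assumes a hypothetical temp table #datatable with only four group columns and assumes that blanks are stored as NULL; the table name, column types, and sample rows are made up for illustration.

```sql
-- Hypothetical four-column stand-in for the real table.
create table #datatable (
    businesskey char(5) primary key,
    group1 varchar(10), group2 varchar(10),
    group3 varchar(10), group4 varchar(10)
);

insert into #datatable (businesskey, group1, group2, group3, group4)
values ('00001', 'abc123', null, 'efg456', 'ght456'),
       ('00002', null,     null, 'xyz789', null);

select dt.businesskey, p.grp1, p.grp2, p.grp3, p.grp4
from #datatable dt cross apply
     (-- Pivot the renumbered values back into columns.
      select max(case when s.seqnum = 1 then s.grp end) as grp1,
             max(case when s.seqnum = 2 then s.grp end) as grp2,
             max(case when s.seqnum = 3 then s.grp end) as grp3,
             max(case when s.seqnum = 4 then s.grp end) as grp4
      from (-- Unpivot the row, drop the gaps, renumber what is left
            -- while keeping the original column order.
            select v.grp, row_number() over (order by v.num) as seqnum
            from (values (dt.group1, 1),
                         (dt.group2, 2),
                         (dt.group3, 3),
                         (dt.group4, 4)
                 ) v(grp, num)
            where v.grp is not null
           ) s
     ) p;

-- Result:
-- 00001 | abc123 | efg456 | ght456 | NULL
-- 00002 | xyz789 | NULL   | NULL   | NULL
```

Because the inner query is an aggregate without a group by, a row whose groups are all NULL still comes back once, with every grp column NULL. If the blanks are empty strings rather than NULL, wrapping each value as nullif(dt.group1, '') in the values list should give the same behavior.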
