How to reshape data after GROUPING SETS in Hive?

Time: 2017-05-13 17:14:01

Tags: hive hiveql cube rollup grouping-sets

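I'd like to aggregate one column over many different dimensions. I think GROUPING SETS fits my problem, but I can't figure out how to transform/reshape the result table that GROUPING SETS returns.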

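This is my query using GROUPING SETS:

select date, dim1, dim2, dim3, sum(value) as sum_value
from table
group by date, dim1, dim2, dim3
grouping sets ((date, dim1), (date, dim2), (date, dim3))

The query produces a table like this:

date, dim1, dim2, dim3, sum_value
2017-01-01, A, NULL, NULL, [value_A]
2017-01-01, B, NULL, NULL, [value_B]
2017-01-01, NULL, C, NULL, [value_C]
2017-01-01, NULL, D, NULL, [value_D]
2017-01-01, NULL, NULL, E, [value_E]
2017-01-01, NULL, NULL, F, [value_F]

But what I actually need is a table like this:

date, dim, factor, sum_value
2017-01-01, dim1, A, [value_A]
2017-01-01, dim1, B, [value_B]
2017-01-01, dim2, C, [value_C]
2017-01-01, dim2, D, [value_D]
2017-01-01, dim3, E, [value_E]
2017-01-01, dim3, F, [value_F]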

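The actual number of dimensions is much larger than 3, so hard-coding the query is not a good idea. Is there a way to reshape the table produced by the grouping sets, or some other aggregation method, to get the desired table?

Thanks!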

1 Answer:

Answer 0: (score: 2)

select    `date`
         ,elt(log2(GROUPING__ID - 1),'dim1','dim2','dim3')      as dim
         ,coalesce (dim1,dim2,dim3)                             as factor
         ,sum(value)                                            as sum_value

from      `table`

group by  `date`,dim1,dim2,dim3
          grouping sets ((`date`,dim1),(`date`,dim2),(`date`,dim3))
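
A note on why the query above works, plus a hedged alternative. In Hive releases before 2.3.0, GROUPING__ID is a bitmask in which the first GROUP BY column is the lowest bit and a bit is set when that column is part of the current grouping set. The three sets therefore yield 3, 5 and 9, so log2(GROUPING__ID - 1) evaluates to 1, 2 and 3, which elt maps to 'dim1', 'dim2' and 'dim3'. As far as I know, the encoding of GROUPING__ID changed in Hive 2.3.0, so that expression may not carry over unchanged to newer versions. The sketch below avoids GROUPING__ID altogether and derives the dimension name from whichever column is non-NULL. It rests on one assumption that is not stated in the question: the dimension columns never contain NULL in the source data.

```sql
-- Sketch only: assumes dim1..dim3 hold no NULLs in the source data,
-- so a non-NULL value identifies the grouping set that produced the row.
select    `date`
         ,case
            when dim1 is not null then 'dim1'
            when dim2 is not null then 'dim2'
            when dim3 is not null then 'dim3'
          end                                as dim
         ,coalesce(dim1, dim2, dim3)         as factor
         ,sum(value)                         as sum_value

from      `table`

group by  `date`, dim1, dim2, dim3
          grouping sets ((`date`, dim1), (`date`, dim2), (`date`, dim3))
```

With more dimensions, each extra column needs one more grouping set, one more `when` branch, and one more argument to coalesce, so the statement is probably best generated from the list of column names rather than written by hand.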