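Impala - Convert columns to rows without using UNION ALL

Date: 2017-01-31 18:58:36

Tags: sql impala

I have a table of users with a flag for whether each user was active in a given week. For example, this is the user_activity table: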

userid | wk1 | wk2 | wk3
u1     |  1  |  0  |  1
u2     |  0  |  1  |  0
u3     |  1  |  0  |  1

I want to convert it to:

week   | active
wk1    |   2
wk2    |   1
wk3    |   2

I can achieve this with UNION ALL:

SELECT 'wk1' as week,
       SUM( wk1 ) AS active
FROM user_activity
UNION ALL
SELECT 'wk2' as week,
       SUM( wk2 ) AS active
FROM user_activity
UNION ALL
SELECT 'wk3' as week,
       SUM( wk3 ) AS active
FROM user_activity;

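Is there a way to achieve the same result without using UNION ALL?

Thanks!

Edit: Impala version: 2.6.0

Reason for avoiding UNION ALL: each SELECT scans the entire table from HDFS. With a huge table, this leads to OOM errors.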

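4 answers:

Answer 0: (score: 3)

This should solve your performance problem. The table is scanned only once, and the records are not duplicated three times. UNION ALL is used only on single-row SELECTs, not on the table itself.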

select  concat('wk',cast(c.i as string))    as week
       ,case c.i 
            when 1 then wk1 
            when 2 then wk2
            when 3 then wk3
        end                 as active

from   (select  sum(wk1) AS wk1
               ,sum(wk2) AS wk2
               ,sum(wk3) AS wk3 

        from    user_activity
        ) t

        cross join (            select 1 as i
                    union all   select 2 
                    union all   select 3
                    ) c
;

+------+--------+
| week | active |
+------+--------+
| wk1  | 2      |
| wk2  | 1      |
| wk3  | 2      |
+------+--------+
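
If you want to check this locally without a cluster, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for Impala (an assumption made for convenience; it says nothing about how Impala plans the scan). The SQL is adapted slightly for SQLite: `||` replaces `concat(...)` and the cast is dropped. It only confirms that the aggregate-then-cross-join query returns the same rows as the UNION ALL version from the question.

```python
import sqlite3

# In-memory SQLite database standing in for Impala (assumption for local testing only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_activity (userid TEXT, wk1 INTEGER, wk2 INTEGER, wk3 INTEGER)"
)
conn.executemany(
    "INSERT INTO user_activity VALUES (?, ?, ?, ?)",
    [("u1", 1, 0, 1), ("u2", 0, 1, 0), ("u3", 1, 0, 1)],
)

# The query from the question: one SELECT over the table per week column.
union_all_sql = """
SELECT 'wk1' AS week, SUM(wk1) AS active FROM user_activity
UNION ALL
SELECT 'wk2' AS week, SUM(wk2) AS active FROM user_activity
UNION ALL
SELECT 'wk3' AS week, SUM(wk3) AS active FROM user_activity
"""

# The query from this answer: aggregate once, then fan the single
# result row out over a three-row helper table of week numbers.
single_scan_sql = """
SELECT 'wk' || c.i AS week,
       CASE c.i WHEN 1 THEN t.wk1 WHEN 2 THEN t.wk2 WHEN 3 THEN t.wk3 END AS active
FROM (SELECT SUM(wk1) AS wk1, SUM(wk2) AS wk2, SUM(wk3) AS wk3
      FROM user_activity) t
CROSS JOIN (SELECT 1 AS i UNION ALL SELECT 2 UNION ALL SELECT 3) c
"""

union_rows = sorted(conn.execute(union_all_sql).fetchall())
single_rows = sorted(conn.execute(single_scan_sql).fetchall())

print(single_rows)
print(union_rows == single_rows)
```

Both queries return the same three (week, active) pairs on the sample data. Whether the table is really read only once on Impala is something to confirm with EXPLAIN on your own cluster.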

Answer 1: (score: 1)

Leaving aside the trivial "use UNION", the question seems a bit absurd. But here is one approach:

with nounionall as (
      select (case row_number() over (order by userid)
                  when 1 then 'wk1'
                  when 2 then 'wk2'
                  when 3 then 'wk3'
              end) as week
      from user_activity ua
      limit 3
     )
select nounionall.week,
       sum(case when nounionall.week = 'wk1' then wk1
                when nounionall.week = 'wk2' then wk2
                when nounionall.week = 'wk3' then wk3
           end) as actives
from nounionall cross join
     user_activity ua
group by nounionall.week

Answer 2: (score: 0)

You can try unpivoting and then aggregating. This way, the user_activity table is read only once.

select
    w.week,
    sum(case w.week 
        when 'wk1' then wk1
        when 'wk2' then wk2
        when 'wk3' then wk3
    end) active
from user_activity u
cross join (
    select 'wk1' week union all
    select 'wk2' week union all
    select 'wk3' week 
) w group by w.week;

Produces:

+------+--------+
| week | active |
+------+--------+
| wk1  | 2      |
| wk2  | 1      |
| wk3  | 2      |
+------+--------+

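It needs only one aggregation instead of three. I use UNION ALL only to build the helper table of week labels for the unpivot. I don't use it on the user table.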
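
One thing to keep in mind with this form: as written, the cross join sits below the GROUP BY, so each user row is paired with all three week labels before the sums are taken. Here is a rough sketch of that, again using Python's sqlite3 as a stand-in for Impala (an assumption for local testing; Impala's planner may handle the join differently). It counts the rows coming out of the join and then runs the query itself.

```python
import sqlite3

# In-memory SQLite database standing in for Impala (assumption for local testing only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_activity (userid TEXT, wk1 INTEGER, wk2 INTEGER, wk3 INTEGER)"
)
conn.executemany(
    "INSERT INTO user_activity VALUES (?, ?, ?, ?)",
    [("u1", 1, 0, 1), ("u2", 0, 1, 0), ("u3", 1, 0, 1)],
)

# Helper table of week labels, built with UNION ALL on single-row SELECTs.
weeks = "(SELECT 'wk1' AS week UNION ALL SELECT 'wk2' UNION ALL SELECT 'wk3')"

# Number of rows the cross join yields before GROUP BY collapses them:
# one row per (user, week) pair.
joined_rows = conn.execute(
    f"SELECT COUNT(*) FROM user_activity u CROSS JOIN {weeks} w"
).fetchone()[0]

# The unpivot-and-aggregate query from this answer.
result = conn.execute(f"""
SELECT w.week,
       SUM(CASE w.week
               WHEN 'wk1' THEN wk1
               WHEN 'wk2' THEN wk2
               WHEN 'wk3' THEN wk3
           END) AS active
FROM user_activity u
CROSS JOIN {weeks} w
GROUP BY w.week
ORDER BY w.week
""").fetchall()

print(joined_rows)
print(result)
```

With the three sample users the join yields nine rows, and the grouped result matches the table above. On a very large table that threefold fan-out may matter, which is why answer 0 aggregates first and joins afterwards.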

Answer 3: (score: 0)

How about this very simple solution without any union:

SELECT [week],active
FROM (
    SELECT SUM(wk1)wk1,SUM(wk2)wk2,SUM(wk3)wk3
    FROM user_activity)pvt
UNPIVOT ([active] FOR [Week] IN  (wk1,wk2,wk3)) unpvt