How to pivot column values into separate columns using Hive

Time: 2019-02-22 06:59:54

Tags: hive hiveql

Input:

 name  year  run
 a     2008  4
 a     2009  3
 a     2008  4
 b     2009  8
 b     2008  5

Expected output in Hive:

 name  2008  2009
 a     8     3
 b     5     8

2 Answers:

Answer 0: (score: 0)

For a fixed set of years:

select name,
       max(case when year=2008 then run end) as year_2008,
       max(case when year=2009 then run end) as year_2009
       -- ... and so on: one such line per further year, comma-separated
  from my_table
 group by name;

Hive cannot generate such columns dynamically, but you can first select the distinct years and then use a shell script to generate this SQL.
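The shell approach could look roughly like the sketch below. It is only an illustration under assumptions: `generate_pivot_sql` is a hypothetical helper name, the years are hard-coded instead of being read from Hive, and the table and column names (`my_table`, `name`, `year`, `run`) are taken from the query above. It mirrors that query's `max` aggregate; swap in `sum(... else 0 end)` if several rows per name and year should be added together.

```shell
#!/bin/sh
# Hypothetical helper: print a pivot query with one column per year argument.
# Table/column names (my_table, name, year, run) follow the answer above.
generate_pivot_sql() {
    sql="select name"
    for y in "$@"; do
        # Append one conditional-aggregate column for this year.
        sql="$sql,
       max(case when year=$y then run end) as year_$y"
    done
    printf '%s\n  from my_table\n group by name;\n' "$sql"
}

# Assumption: in practice the years would be fetched from Hive first, e.g.
#   years=$(hive -S -e "select distinct year from my_table order by year")
#   generate_pivot_sql $years
# They are hard-coded here to keep the sketch self-contained.
generate_pivot_sql 2008 2009
```

The printed statement can then be saved to a file and run with `hive -f`, or passed directly to `hive -e`.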

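Answer 1: (score: 0)

As I understand it, you want the runs for each year pivoted into one column per year.

You need the sum function, not max: name a has two rows for 2008 (4 and 4), and the expected output shows 8, which is their sum, whereas max would return 4.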

select name,
       sum(case when year=2008 then run else 0 end) as run_2008,
       sum(case when year=2009 then run else 0 end) as run_2009
  from my_table t1
 group by name;

To find the top 5 scorers for each year:

with table1 as
(
  -- total runs per player per year
  select name, year, sum(run) as RunsPerYear from my_table group by name, year
),
table2 as
(
  -- rank players within each year, highest total first
  select name, year, RunsPerYear,
         dense_rank() over (partition by year order by RunsPerYear desc) as rnk
    from table1
)
select name, year, RunsPerYear from table2 where rnk <= 5;