如何将多个列值转换为蜂巢中的行?

时间:2020-02-04 11:09:16

标签: hadoop hive

Input:

ID    COLUMN1     COLUMN2             COLUMN3
1     M,S,E,T     1,2,3,4             5,6,7
2     A,B,C       6,5,8,7,9,1         2,4,3,0,1


Output:

ID      COLUMN1      COLUMN2      COLUMN3
1        M           10           50
1        S           20           60
1        E           30           70
1        T           40           NULL
2        A           6            2
2        B           5            4
2        C           8            3 
2        NULL        7            0
2        NULL        9            1
2        NULL        1            NULL

代码:

select ID,
array_index( COLUMN1_arr, n ) as COLUMN1,
array_index( COLUMN2_arr, n ) as COLUMN2
from sample
lateral view numeric_range(size(COLUMN1_arr)) n1 as n;

错误:

失败:语义异常[错误10011]:无效的函数array_index

这里我在单列中有多个值,我需要将其转换为行,如输出所述。

2 个答案:

答案 0 :(得分:0)

分解是蜂巢中提供的UDTF,您可以使用它将数据从列拆分为行。

git config --global http.proxy http://DOMAIN\username@proxy-server.domain:8080 

答案 1 :(得分:0)

普通香草蜂巢解决方案,无砖房UDF。

演示:

with 

input as ( ---------------Input dataset
select stack(2,
1, array('M','S','E','T'), array(1,2,3,4),     array(5,6,7),
2, array('A','B','C'),     array(6,5,8,7,9,1), array( 2,4,3,0,1) 
) as (ID,COLUMN1,COLUMN2,COLUMN3)
),

--explode each array and FULL join them
c1 as (
select i.id, v.column1, p
  from input i
       lateral view posexplode(i.COLUMN1) v as p,column1
),

c2 as (
select i.id, v.column2, p
  from input i
       lateral view posexplode(i.COLUMN2) v as p,column2
),

c3 as (
select i.id, v.column3, p
  from input i
       lateral view posexplode(i.COLUMN3) v as p,column3
)

--FULL JOIN
select coalesce(c1.id,c2.id,c3.id) id, c1.column1, c2.column2, c3.column3
  from c1 
       full join c2 on c1.id=c2.id and c1.p=c2.p
       full join c3 on nvl(c1.id,c2.id)=c3.id and nvl(c1.p,c2.p)=c3.p --note NVL usage
;   

结果:

OK
id      column1 column2 column3
1       M       1       5
1       S       2       6
1       E       3       7
1       T       4       NULL
2       A       6       2
2       B       5       4
2       C       8       3
2       NULL    7       0
2       NULL    9       1
2       NULL    1       NULL