Apache Hive: one row per group

Time: 2018-10-03 12:31:35

Tags: sql apache hive

I have an Apache Hive table like this:

id          llvc    lp
2428766324  P005    P048
2428766324  P005    P024
2428766324  P005    NULL
2429788401  P005    P024
2429788401  P005    NULL
2429788401  P005    P048
2457843473  P005    P024
2457843473  P005    P048
2457843473  P005    NULL
2457872560  P005    NULL
2457872560  P005    P048
2457872560  P005    P024

Each id has one or more rows, and I want to keep a single row per id according to the following conditions:

For each group of rows sharing an id:

If the group has exactly one row, take that row
If the group has more than one row, take the row where llvc = lp
If the group has more than one row and none has llvc = lp, take the row where lp is NULL

and discard the other rows in the group.

For example, given:

id          llvc    lp     
2428766324  P005    P048      
2428766324  P005    P024      
2428766324  P005    NULL 

I want to keep the row 2428766324 P005 NULL

1 answer:

Answer 0 (score: 2)

Use row_number() to rank the rows within each id, then keep the first one:

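select *
from (select t.*,
             -- rank rows within each id: llvc = lp first, then lp is null, then the rest
             row_number() over (partition by id
                                order by (case when llvc = lp then 1 
                                               when lp is null then 2
                                               else 3
                                          end)
                                ) as seqnum
      from t  -- placeholder: replace t with the actual table name
     ) t
where seqnum = 1;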
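To check the ranking logic against the sample data without a Hive cluster, the same query can be run in SQLite through Python's standard sqlite3 module. This is only a sketch under several assumptions: SQLite (3.25 or later, for window functions) stands in for Hive, the table is called t (the question does not name it), the NULL values in the sample are real SQL NULLs rather than the string 'NULL', and the helper name pick_one_per_id is made up for illustration.

```python
import sqlite3

# Sample rows from the question; None stands for SQL NULL
# (assumption: the table holds real NULLs, not the string 'NULL').
rows = [
    (2428766324, "P005", "P048"),
    (2428766324, "P005", "P024"),
    (2428766324, "P005", None),
    (2429788401, "P005", "P024"),
    (2429788401, "P005", None),
    (2429788401, "P005", "P048"),
    (2457843473, "P005", "P024"),
    (2457843473, "P005", "P048"),
    (2457843473, "P005", None),
    (2457872560, "P005", None),
    (2457872560, "P005", "P048"),
    (2457872560, "P005", "P024"),
]


def pick_one_per_id(rows):
    """Load rows into an in-memory table t and keep one row per id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (id integer, llvc text, lp text)")
    conn.executemany("insert into t values (?, ?, ?)", rows)
    query = """
        select id, llvc, lp
        from (select t.*,
                     row_number() over (partition by id
                                        order by (case when llvc = lp then 1
                                                       when lp is null then 2
                                                       else 3
                                                  end)
                                       ) as seqnum
              from t
             ) t
        where seqnum = 1
        order by id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


picked = pick_one_per_id(rows)
for row in picked:
    print(row)
```

On the sample data no row has llvc = lp, so each id keeps its row with lp NULL. A group with a single row always gets seqnum 1, so the first condition needs no special handling. If a group has several rows and none of them matches llvc = lp or has lp NULL, all of its rows tie at rank 3 and the row that is kept is arbitrary.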