I have an Apache Hive table like this:
id llvc lp
2428766324 P005 P048
2428766324 P005 P024
2428766324 P005 NULL
2429788401 P005 P024
2429788401 P005 NULL
2429788401 P005 P048
2457843473 P005 P024
2457843473 P005 P048
2457843473 P005 NULL
2457872560 P005 NULL
2457872560 P005 P048
2457872560 P005 P024
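For each id I have one or more lines, and I want to keep one row per id according to the following conditions:
For each group of id:
if the number of lines = 1, take that line
if the number of lines > 1, take the line where llvc = lp
if the number of lines > 1 and no line has llvc = lp, take the line where lp is NULL
and discard the other rows in the group.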
For example:
id llvc lp
2428766324 P005 P048
2428766324 P005 P024
2428766324 P005 NULL
I want to keep 2428766324 P005 NULL
Answer 0 (score: 2)
Use row_number():
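select *
from (select t.*,
             -- rank the rows of each id: llvc = lp first, then lp NULL, then the rest
             row_number() over (partition by id
                                order by (case when llvc = lp then 1
                                               when lp is null then 2
                                               else 3
                                          end)
                               ) as seqnum
      from t  -- t is a placeholder for the actual table name, which the question does not give
     ) t
where seqnum = 1;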
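To sanity-check the ranking rule outside Hive, the same priority order can be sketched in plain Python against the sample rows from the question. This is only an illustration under assumptions: `priority` and `pick_one_per_id` are hypothetical helper names, `None` stands in for SQL NULL, and ties within a priority level are broken by input order, whereas row_number() gives no such guarantee.

```python
from itertools import groupby

# Sample rows from the question as (id, llvc, lp); None stands for SQL NULL.
rows = [
    (2428766324, "P005", "P048"),
    (2428766324, "P005", "P024"),
    (2428766324, "P005", None),
    (2429788401, "P005", "P024"),
    (2429788401, "P005", None),
    (2429788401, "P005", "P048"),
    (2457843473, "P005", "P024"),
    (2457843473, "P005", "P048"),
    (2457843473, "P005", None),
    (2457872560, "P005", None),
    (2457872560, "P005", "P048"),
    (2457872560, "P005", "P024"),
]


def priority(row):
    """Mirror the CASE expression: 1 if llvc = lp, 2 if lp is NULL, else 3."""
    _, llvc, lp = row
    if lp is not None and llvc == lp:
        return 1
    if lp is None:
        return 2
    return 3


def pick_one_per_id(rows):
    """Keep, for each id, the row with the smallest priority (seqnum = 1)."""
    ordered = sorted(rows, key=lambda r: r[0])  # groupby needs sorted input
    return [
        min(group, key=priority)
        for _, group in groupby(ordered, key=lambda r: r[0])
    ]


selected = pick_one_per_id(rows)
for row in selected:
    print(row)
```

A group with a single row always returns that row, whatever its priority, which matches the first condition in the question without a separate count check.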