SQL: selecting rows by priority across multiple columns

Time: 2016-02-22 00:33:36

Tags: sql hive

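Here is some sample data. I am trying to get one record per UserID for that user's most recent activity date. If a user watched several movies on that date, the record should be chosen according to the priority associated with the movie name.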

UserID MovieName ActivityDate
1       MOV1    2015-02-12
2       MOV2    2015-04-22
1       MOV3    2015-03-16
3       MOV1    2015-06-23
2       MOV5    2016-01-01
2       MOVH    2016-01-01

Priority associated with the movie names (highest first):

MOV1 > MOV2 > MOV3 > MOV5 > MOVH

Expected result:

UserID MovieName ActivityDate
1       MOV3    2015-03-16
2       MOV5    2016-01-01
3       MOV1    2015-06-23

I tried a combination of GROUP BY and CASE, but I am fairly sure there is a better way. Any help is appreciated.

2 answers:

Answer 0 (score: 2)

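select *
from (
    select *
         -- one partition per user; latest date first, then movie name ascending
         -- (alphabetical order only happens to match the stated priority in this sample)
         , row_number() OVER (partition by UserID order by ActivityDate desc, MovieName asc) as rnk
    from movies) m
where m.rnk = 1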

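Answer 1 (score: 0)

row_number() is the right approach, but you need to be careful with the order by: sort by ActivityDate descending first, then by an explicit priority for MovieName: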

select m.*
from (select m.*,
             row_number() over (partition by UserId
                                order by ActivityDate desc,
                                         (case MovieName
                                               when 'MOV1' then 1
                                               when 'MOV2' then 2
                                               when 'MOV3' then 3
                                               when 'MOV5' then 4
                                               when 'MOVH' then 5
                                               else 999
                                          end)
                               ) as seqnum
      from movies m
     ) m
where seqnum = 1;
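
If the priority list is likely to grow, the CASE expression can be replaced by a join against a small lookup table. The following is only a sketch under assumptions: the table movie_priority and its column Priority are hypothetical names that do not appear in the question, and the INSERT ... VALUES syntax assumes a Hive version that supports it (or any standard SQL engine).

```sql
-- Hypothetical lookup table (not part of the original question):
-- a lower Priority value means a higher priority.
CREATE TABLE movie_priority (MovieName STRING, Priority INT);

INSERT INTO movie_priority VALUES
    ('MOV1', 1), ('MOV2', 2), ('MOV3', 3), ('MOV5', 4), ('MOVH', 5);

SELECT UserID, MovieName, ActivityDate
FROM (
    SELECT m.UserID,
           m.MovieName,
           m.ActivityDate,
           -- Latest date first; ties on the date are broken by priority.
           -- Movies missing from the lookup table sort last (999).
           ROW_NUMBER() OVER (PARTITION BY m.UserID
                              ORDER BY m.ActivityDate DESC,
                                       COALESCE(p.Priority, 999)) AS seqnum
    FROM movies m
    LEFT JOIN movie_priority p
           ON p.MovieName = m.MovieName
) ranked
WHERE seqnum = 1;
```

On the sample data this should select the same three rows as the expected result in the question. The LEFT JOIN keeps movies that have no entry in the lookup table, mirroring the else 999 branch of the CASE version.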