Rewriting a SQL window function (rank() over partition) without using window functions

Asked: 2014-03-15 23:00:31

Tags: sql hadoop hive hiveql

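Window functions were introduced in Hive 0.12, but I am unable to move to a newer version of Hive. Is it possible to rewrite the following SQL in HiveQL without window functions?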

 select userid, siteid, eventdate, 
 count(*) over(partition by userid, siteid order by eventdate) as c, 
 rank() over (partition by userid, siteid order by eventdate) as rank
 from views 

My current version of Hive (0.10) supports neither window functions nor subqueries in the where clause, e.g.:

 select userid, clicks from clicktable where userid in (select userid from usertable)

1 answer:

Answer 0 (score: 0)

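The natural way to write this is with a correlated subquery. However, you can also do it with a self-join, aggregation, and conditional aggregate functions: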

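-- Assumes (userid, siteid, eventdate) is unique in views; duplicate rows would
-- be collapsed by the group by and would inflate the sums.
select v.userid, v.siteid, v.eventdate,
       -- running count: rows in the same partition dated on or before this row
       sum(case when v2.eventdate <= v.eventdate then 1 else 0 end) as c,
       -- distinct dates on or before this row; with tied dates this behaves
       -- like dense_rank() rather than rank()
       count(distinct case when v2.eventdate <= v.eventdate then v2.eventdate end) as rank
from views v left outer join
     views v2 
     on v.userid = v2.userid and
        v.siteid = v2.siteid
group by v.userid, v.siteid, v.eventdate;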
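
For comparison, the correlated-subquery form mentioned above might look like the following sketch. It is written in standard SQL rather than HiveQL: Hive versions of that era do not accept subqueries in the select list, so treat it as an illustration for other engines, not something to run on Hive 0.10. The aliases `v2` and `v3` are arbitrary, and the alias `rnk` is an assumption used in place of `rank`, which is a reserved word in some databases.

```sql
select v.userid, v.siteid, v.eventdate,
       -- running count: rows in the same partition dated on or before this row
       (select count(*)
          from views v2
         where v2.userid = v.userid
           and v2.siteid = v.siteid
           and v2.eventdate <= v.eventdate) as c,
       -- rank(): one more than the number of strictly earlier rows,
       -- so tied dates share a rank and the next rank skips ahead
       (select count(*) + 1
          from views v3
         where v3.userid = v.userid
           and v3.siteid = v.siteid
           and v3.eventdate < v.eventdate) as rnk
from views v;
```

Because each subquery is evaluated per row of `views`, this form returns one output row per input row even when several rows share the same eventdate, which the join-and-group-by version does not.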