SQL query in Hive to get the top 2 values in 2 columns

Time: 2015-10-29 17:34:08

Tags: mysql sql hive ranking

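I have a table like access(url, access_time), where each URL may have many access times.

I have another table, asset(url, foo).

I want to write a query that turns these into joined_data(url, first_access_time, second_access_time).

If a URL has no access times, first_access_time should be NULL; if it has no second access time, second_access_time should be NULL.

How can I do this in Hive?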

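1 answer:

Answer 0: (score: 1)

You can do this with the row_number() window function: number each URL's access times in ascending order, then pick out rows 1 and 2.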

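-- Number each asset's access times from earliest to latest.
-- Partition by ast.url rather than a.url: a.url is NULL for assets with no
-- access rows, so partitioning on it would lump all of those assets together.
with twotimes as (select ast.url, a.access_time,
                  row_number() over(partition by ast.url order by a.access_time) as rn
                  from asset ast
                  left join access a on a.url = ast.url )
select url, max(first_access_time) as first_access_time,
       max(second_access_time) as second_access_time
from (
-- rn = 1 is the earliest access; an asset with no accesses still has one row
-- here, with a NULL access_time
select url, access_time as first_access_time, null as second_access_time
from twotimes where rn = 1
union all
-- rn = 2 is the second access, when one exists
select url, null as first_access_time, access_time as second_access_time
from twotimes where rn = 2
) t
group by url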
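
The same result can probably be reached without the union all, by pivoting the row numbers with conditional aggregation. This is a minimal sketch, not a tested query: it reuses the table and column names from the question (asset, access, url, access_time), the CTE name ranked is made up for illustration, and it assumes asset.url is unique and access.access_time is never NULL.

```sql
-- Sketch only: assumes asset.url is unique and access.access_time is not NULL.
with ranked as (
  select ast.url,
         a.access_time,
         -- Partition by the asset's url so that every asset, including one
         -- with no matching access rows, gets its own numbering from 1.
         row_number() over (partition by ast.url order by a.access_time) as rn
  from asset ast
  left join access a on a.url = ast.url
)
select url,
       -- A case expression with no else branch yields NULL for non-matching rows,
       -- and max() ignores NULLs, so each column keeps only its own row's value.
       max(case when rn = 1 then access_time end) as first_access_time,
       max(case when rn = 2 then access_time end) as second_access_time
from ranked
group by url;
```

An asset with no access rows still produces one row from the left join (rn = 1, access_time NULL), so both output columns come back NULL for it. An asset with a single access gets first_access_time filled and second_access_time NULL. If two accesses share the same access_time, row_number() picks their relative order arbitrarily.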