Repeat a value in a new column based on the maximum of another column

Time: 2019-02-20 17:11:00

Tags: hive hiveql

I have a table "mytable" whose contents look like this:

currenttime             racetype            raceid
2018-01-01 03:15:00     gold                22
2018-01-01 04:15:00     silver              22
2019-01-01 04:15:00     bronze              22
2017-01-02 11:44:00     platinum            22

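I am trying to create another column based on the maximum currenttime. It should take the racetype from the row with the maximum currenttime and repeat that value in the new column for every row, like this: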

currenttime             racetype          raceid     besttype
2018-01-01 03:15:00     gold              22         bronze
2018-01-01 04:15:00     silver            22         bronze
2019-01-01 04:15:00     bronze            22         bronze
2017-01-02 11:44:00     platinum          22         bronze

If there are other raceid values, the same should be done for each of them:

currenttime             racetype          raceid     besttype
2018-01-01 03:15:00     gold              22         bronze
2018-01-01 04:15:00     silver            22         bronze
2019-01-01 04:15:00     bronze            22         bronze
2017-01-02 11:44:00     platinum          22         bronze
2011-01-01 03:15:00     gold              09         silver
2022-01-01 04:15:00     silver            09         silver
2002-01-01 04:15:00     bronze            09         silver

Currently I have this query:

select mt.raceid, tt.racetype, MAX(tt.currenttime) 
OVER (PARTITION by mt.raceid) 
from mytable mt 
join tabletwo tt on mt.id = tt.id
where mt.raceid = 22

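This query does not give the expected output; it returns the maximum currenttime rather than the corresponding racetype: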

raceid         racetype         col0
22             gold             2019-01-01 04:15:00 
22             silver           2019-01-01 04:15:00 
22             platinum         2019-01-01 04:15:00
22             bronze           2019-01-01 04:15:00 

How can I get the expected results shown in the second and third examples above?

1 answer:

Answer 0 (score: 1)

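Use the first_value analytic function, ordering each raceid partition by currenttime descending so that the first row is the latest one: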

select currenttime, racetype, raceid,
       first_value(racetype) over(partition by raceid order by currenttime desc) as besttype
  from mytable
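
To try the first_value query without the real table, here is a minimal sketch that inlines the sample rows from the question as a CTE. It assumes a Hive version that allows CTEs and SELECT without a FROM clause, and it assumes raceid is stored as a string so that '09' keeps its leading zero; neither detail is confirmed by the question.

```sql
-- Sample rows from the question, inlined so the query is self-contained.
-- Assumption: raceid is a string column; currenttime is a timestamp.
with mytable as (
  select cast('2018-01-01 03:15:00' as timestamp) as currenttime,
         'gold' as racetype, '22' as raceid
  union all select cast('2018-01-01 04:15:00' as timestamp), 'silver',   '22'
  union all select cast('2019-01-01 04:15:00' as timestamp), 'bronze',   '22'
  union all select cast('2017-01-02 11:44:00' as timestamp), 'platinum', '22'
  union all select cast('2011-01-01 03:15:00' as timestamp), 'gold',     '09'
  union all select cast('2022-01-01 04:15:00' as timestamp), 'silver',   '09'
  union all select cast('2002-01-01 04:15:00' as timestamp), 'bronze',   '09'
)
-- Within each raceid, the latest currenttime sorts first,
-- so first_value picks that row's racetype for every row.
select currenttime, racetype, raceid,
       first_value(racetype) over (partition by raceid
                                   order by currenttime desc) as besttype
  from mytable;
-- besttype is 'bronze' for every raceid '22' row
-- and 'silver' for every raceid '09' row.
```

No filter on raceid is needed: the partition by clause handles each raceid separately.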

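Or use last_value with ascending order. It needs an explicit frame that reaches the end of the partition, because with ORDER BY and no frame the default frame ends at the current row, so last_value would return the current row's own racetype:

select currenttime, racetype, raceid,
       last_value(racetype) over(partition by raceid order by currenttime
                                 rows between unbounded preceding and unbounded following) as besttype
  from mytable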
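
The effect of the window frame on last_value can be seen by computing it both ways side by side. This is a sketch using the three raceid 09 rows from the question, inlined as a CTE; the column names default_frame and full_frame are made up for illustration, and raceid is assumed to be a string.

```sql
-- Three sample rows from the question (raceid '09'), inlined for illustration.
with mytable as (
  select cast('2011-01-01 03:15:00' as timestamp) as currenttime,
         'gold' as racetype, '09' as raceid
  union all select cast('2022-01-01 04:15:00' as timestamp), 'silver', '09'
  union all select cast('2002-01-01 04:15:00' as timestamp), 'bronze', '09'
)
select currenttime, racetype, raceid,
       -- Default frame with ORDER BY: unbounded preceding .. current row,
       -- so the "last" row in the frame is the current row itself.
       last_value(racetype) over (partition by raceid
                                  order by currenttime) as default_frame,
       -- Explicit frame spanning the whole partition:
       -- the last row is the one with the latest currenttime.
       last_value(racetype) over (partition by raceid
                                  order by currenttime
                                  rows between unbounded preceding
                                           and unbounded following) as full_frame
  from mytable;
-- default_frame repeats each row's own racetype (the timestamps are distinct);
-- full_frame is 'silver' on every row.
```

The first_value variant with descending order does not have this problem, since the first row of the partition is always inside the default frame.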