I have data like this:
column1 column2 column3
A V 10
A Z 11
A X 11
For each value in column1, I want to find the maximum value in column3
and the corresponding value in column2. How do I do this in Hive?
The closest I have is select column1, max(column3) from table group by column1.
But this doesn't return the corresponding value from column2. How do I get that as well?
If there is a tie in column3, I don't care which value is pulled from column2. Thanks.
I want the result to be:
column1 column2 column3
A Z 11
Answer 0 (score: 1)
One way is to use row_number. In case of a tie, you will get an arbitrary value of column2.
select column1,column2,column3
from (
select t.*,row_number() over(partition by column1 order by column3 desc) as rn
from tablename t
) x
where rn=1
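If a repeatable result is preferred over an arbitrary pick on ties, a tie-breaker can be added to the window ordering. The following is a minimal sketch, not part of the original answer: it reuses the answer's tablename and assumes that ordering by column2 descending is an acceptable tie-break rule (the question does not require one).

```sql
-- Sketch: same row_number approach, with an assumed tie-breaker on column2
-- so that ties in column3 always resolve the same way.
select column1, column2, column3
from (
    select t.*,
           row_number() over (
               partition by column1
               order by column3 desc, column2 desc  -- column2 desc is an assumption
           ) as rn
    from tablename t
) x
where rn = 1;
```

For the sample rows in the question, this picks Z rather than X for the tie at 11, because Z sorts after X.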
Answer 1 (score: 0)
Here are some variations on the same concept:
select column1
,max(named_struct('column3',column3,'column2',column2)).column2 as column2
,max(column3) as column3
from mytable
group by column1
;
+---------+---------+---------+
| column1 | column2 | column3 |
+---------+---------+---------+
| A | Z | 11 |
+---------+---------+---------+
select column1
,max(struct(column3,column2)).col2 as column2
,max(column3) as column3
from mytable
group by column1
;
+---------+---------+---------+
| column1 | column2 | column3 |
+---------+---------+---------+
| A | Z | 11 |
+---------+---------+---------+
select column1
,col.column2
,col.column3
from (select column1
,max(named_struct('column3',column3,'column2',column2)) as col
from mytable
group by column1
) t
;
+---------+---------+---------+
| column1 | column2 | column3 |
+---------+---------+---------+
| A | Z | 11 |
+---------+---------+---------+