HiveQL: Select the value of a column paired with the Max(value) of another column

Time: 2018-05-17 09:21:12

Tags: hive hiveql

Let's say I have this table:

Date         Department   Value
2017-01-02    A            30
2017-01-02    B            60
2017-01-02    C            10
2017-01-02    D            40
2017-01-03    C            20
2017-01-03    D            150
2017-01-03    E            100
2017-01-03    F            20
...

For each day, I want the department with the highest 'Value'.

That would result in:

Date          Department   Value
2017-01-02    B            60
2017-01-03    D            150

How can I achieve this?

3 Answers:

Answer 0: (score: 1)

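Break the problem into two parts. First, get the maximum value per date in a CTE.

Then join that result set back to the base table to get the desired rows.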

  with temp as (
      -- maximum value for each date
      select Date, max(Value) as Value
      from tableName
      group by Date
  )
  -- keep the rows whose value equals that date's maximum
  select a.Date, a.Department, a.Value
  from tableName a
  join temp b
    on a.Date = b.Date
   and a.Value = b.Value;

Answer 1: (score: 1)

Your base data:

hive> create table tx1(date1 date,department string,value int) row format delimited fields terminated by ',';
OK
Time taken: 1.172 seconds
hive> load data local inpath '/home/vivekanand/vivek/hive/test.dat' into table tx1;
Loading data to table default.tx1
OK
Time taken: 0.727 seconds
hive> select * from tx1;
OK
2017-01-02  A   30
2017-01-02  B   60
2017-01-02  C   10
2017-01-02  D   40
2017-01-03  C   20
2017-01-03  D   150
2017-01-03  E   100
2017-01-03  F   20
Time taken: 1.89 seconds, Fetched: 8 row(s)

You can use the following analytic function:

select date1,department,value
from(select date1,department,value,rank() over(partition by date1 order by value desc) f from tx1) k
where f=1;

Output:

Total MapReduce CPU Time Spent: 0 msec
OK
2017-01-02  B   60
2017-01-03  D   150
Time taken: 1.569 seconds, Fetched: 2 row(s)

Answer 2: (score: 1)

Use the rank() analytic function. rank() assigns 1 to the row with the highest value for each day.

select Date, Department, Value
from
(
select a.Date, a.Department, a.Value,
       rank() over(partition by a.Date order by a.Value desc) as rnk
  from tableName a
)s
where rnk=1
;
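
If two departments tie for the highest value on the same day, rank() gives both of them 1, so both rows come back. When exactly one row per day is needed, a possible variant is to use row_number() instead. This is a minimal sketch that reuses the tableName placeholder from above; breaking ties by Department name is an assumption made for illustration, not something the question specifies.

```sql
-- Sketch only: tableName is the same placeholder used above.
-- Assumption: ties on Value are broken alphabetically by Department.
select Date, Department, Value
from
(
select a.Date, a.Department, a.Value,
       row_number() over(partition by a.Date order by a.Value desc, a.Department) as rn
  from tableName a
)s
where rn=1
;
```

On the sample data there are no ties, so this should return the same two rows as the rank() version.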