Hive: maximum value across multiple columns

Posted: 2018-09-06 14:25:51

Tags: hadoop hive hiveql

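Hello: I have a situation where I need to find the maximum of 3 calculated fields and store it in another field. Can this be done in a single SQL query? Here is an example: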

SELECT Income1 ,
       Income1 * 0.02 as Personal_Income ,
       Income2 ,
       Income2 * 0.10 as Share_Income ,
       Income3 ,
       Income3 * 0.01 as Job_Income ,
       Max(Personal_Income, Share_Income, Job_Income ) 
  From Table

One approach I tried is to compute Personal_Income, Share_Income, and Job_Income in a first pass, and then compare them in a second pass:

Select 
      Case when Personal_income > Share_Income and Personal_Income > Job_Income 
                then Personal_income 
           when Share_income > Job_Income 
                then Share_income 
           Else Job_income 
      End as greatest_income

But this requires two scans of a table with a billion rows. How can I avoid that and do it in one pass? Any help is appreciated.

1 Answer:

Answer 0 (score: 0)

As of Hive 1.1.0 you can use the greatest() function. The following query completes in a single table scan:

select Income1 ,
       Personal_Income ,
       Income2 ,
       Share_Income ,
       Income3 ,
       Job_Income ,
       greatest(Personal_Income, Share_Income, Job_Income ) as greatest_income
from
(
SELECT Income1 ,
       Income1 * 0.02 as Personal_Income ,
       Income2 ,
       Income2 * 0.10 as Share_Income ,
       Income3 ,
       Income3 * 0.01 as Job_Income
  From Table
)s
;
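
The subquery in the answer is there only so that the computed aliases can be reused by name. As a rough sketch of an equivalent form without it, the three expressions can be passed to greatest() directly. The table name income_table is a hypothetical stand-in (not from the original post), and the percentages are written as the decimals 0.02, 0.10, and 0.01.

```sql
-- Sketch only: income_table is a hypothetical placeholder for the real table.
SELECT Income1,
       Income1 * 0.02 AS Personal_Income,
       Income2,
       Income2 * 0.10 AS Share_Income,
       Income3,
       Income3 * 0.01 AS Job_Income,
       -- The expressions are repeated here because aliases defined in a
       -- SELECT list cannot be referenced elsewhere in that same list.
       greatest(Income1 * 0.02, Income2 * 0.10, Income3 * 0.01) AS greatest_income
FROM income_table;
```

If any of the income columns can be NULL, it is worth checking how greatest() treats NULL arguments in your Hive version before relying on the result.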