Operating on grouped columns in Hive

Time: 2016-08-30 12:05:27

Tags: sql hadoop hive

I have a Hive table holding actual sales and predictions. The data looks like this:

item   date  salesDol   salesUnit   predictionU
1    1/1/2016  5.99      1            0.9
1    1/1/2016  5.49      1            0.9
1    2/1/2016  5.99      1            0.84
1    3/1/2016  6.04      1            0.92

To compute my average price, I attach the overall totals to every row:

create table data1 as 
select item, date, predictionU, totDol, totUnit from data
JOIN
(select sum(salesDol) as totDol, sum(salesUnit) as totUnit from data) t;

So every row now carries totDol and totUnit. To get the final extrapolated sales figure, I then tried:

create table data2 as 
    select item, date, sum(predictionU)*totDol/totUnit from data1 group by item, date;

Then I get an error saying:

  

FAILED: SemanticException [Error 10025]: Expression not in GROUP BY key 'totDol'

I can't understand why Hive requires me to include totDol in the GROUP BY clause. Any suggestions?

1 Answer:

Answer 0: (score: 2)

Just use window functions:

select item, date, predictionU,
       sum(salesDol) over () as totDol,
       sum(salesUnit) over () as totUnit
from data;

You can then include the calculation in the final query:

select item, date, predictionU,
       sum(salesDol) over () as totDol,
       sum(salesUnit) over () as totUnit,
       (predictionU * sum(salesDol) over () / sum(salesUnit) over ())
from data;
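
As for the error itself: in a GROUP BY query, every selected expression has to be either a grouping key or wrapped in an aggregate. totDol is constant across rows, but Hive has no way of knowing that, so it rejects the bare column. If you still want one row per item and date, as in the original attempt, something along these lines may work. It is only a sketch against the table from the question; the alias `t` and the output column name `predictedDol` are made up for illustration, and wrapping the totals in max() is just one way to satisfy the rule (adding totDol and totUnit to the GROUP BY keys would be another).

```sql
-- Sketch only: assumes the table `data` from the question.
-- `t` and `predictedDol` are hypothetical names chosen for this example.
select item, date,
       -- totDol and totUnit are identical on every row, so max() returns
       -- the same value while making them legal in a grouped query
       sum(predictionU) * max(totDol) / max(totUnit) as predictedDol
from (
  -- the empty over () makes each sum a grand total over the whole table
  select item, date, predictionU,
         sum(salesDol)  over () as totDol,
         sum(salesUnit) over () as totUnit
  from data
) t
group by item, date;
```

With the sample rows above, the two 1/1/2016 rows would collapse into one, and their predictionU values would be summed before being multiplied by the overall average price.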