How to get the maximum per group with a query

Asked: 2019-08-29 20:58:42

Tags: hive cqlsh

This is my dataset:

00000000040112    2702      00000000040112  AVAILABLE       1566921227223   -6.0    LB
00000000040112    2702      00000000040112  AVAILABLE       1566921247222   -9.0    LB
00030400791888    6065      00030400791888  AVAILABLE       1566919357992   45.0    EA
00030400791888    6065      00030400791888  AVAILABLE       1566919547809   72.0    EA 

I am trying to get the row with the maximum last_updated_time from each group, so for the data above the expected result would be:

00000000040112  2702    00000000040112  AVAILABLE       1566921247222   -9.0    LB 
00030400791888  6065    00030400791888  AVAILABLE       1566919547809   72.0    EA

This is my query, which does not produce the correct result:

select  
  primegtin, nodeid, gtin, inventory_state, 
  max(last_updated_time), 
  quantity_by_gtin, quantity_uom 
from pit_by_prime_gtin 
where 
  year=2019 and month =8 and day =27 and hour=15 
group by 
  primegtin, nodeid, gtin, inventory_state, 
  last_updated_time, 
  quantity_by_gtin, quantity_uom ;

What could be the problem?

1 answer:

Answer 0 (score: 0)

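You need to remove the columns you are aggregating from the group by clause. Because last_updated_time (and quantity_by_gtin) appear in the group by, every distinct timestamp forms its own group, so max(last_updated_time) just returns each row's own value and no rows are collapsed.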

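In your example, it should probably look something like the query below. Note that max(quantity_by_gtin) is computed independently of max(last_updated_time): it returns the largest quantity in the group, which is not necessarily the quantity from the latest row (for the first group in the sample data it would give -6.0, not -9.0).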

select  
  primegtin, nodeid, gtin, inventory_state, 
  max(last_updated_time), 
  max(quantity_by_gtin), quantity_uom 
from pit_by_prime_gtin 
where 
  year=2019 and month =8 and day =27 and hour=15 
group by 
  primegtin, nodeid, gtin, inventory_state, 
  quantity_uom ;
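
If the goal is to return the whole latest row per group, as in the expected result in the question, one common alternative is a window function. The following is only a sketch in HiveQL, assuming a group is identified by (primegtin, nodeid, gtin, inventory_state) and that your Hive version supports row_number() (0.11 or later); the alias ranked and the column rn are names made up for illustration.

```sql
-- Sketch: keep the full row with the latest last_updated_time per group.
-- Assumption: a group is (primegtin, nodeid, gtin, inventory_state).
select
  primegtin, nodeid, gtin, inventory_state,
  last_updated_time, quantity_by_gtin, quantity_uom
from (
  select
    primegtin, nodeid, gtin, inventory_state,
    last_updated_time, quantity_by_gtin, quantity_uom,
    -- Number the rows of each group, newest first.
    row_number() over (
      partition by primegtin, nodeid, gtin, inventory_state
      order by last_updated_time desc
    ) as rn
  from pit_by_prime_gtin
  where year = 2019 and month = 8 and day = 27 and hour = 15
) ranked
where rn = 1;
```

With this form, quantity_by_gtin and quantity_uom come from the same row as the latest last_updated_time. If two rows in a group share the same timestamp, row_number() keeps only one of them, and which one is not defined unless you add a tie-breaker to the order by.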