Hive query to group by and compute the average of a specific field

Time: 2014-05-23 07:35:15

Tags: hadoop hive

We have the following data, but we are unable to write a Hive query for it.

   CUSTOMER_NAME, PRODUCT_NAME, PRICE, OCCURRENCE_ID
   customer1,     product1,     20,    1
   customer1,     product2,     30,    2
   customer1,     product1,     25,    3
   customer1,     product1,     20,    1
   customer1,     product2,     20,    2
   customer1,     product2,     30,    2

We expect the output described below.

Basically, we want the average price per customer and product.

First, we need to average the price at the customer, product, and occurrence ID level, e.g.

     customer1,product1,20,1   (the average is 20 for occurrence 1)
     customer1,product1,25,3   (the average is 25 for occurrence 3)

Then we need to average those per-occurrence averages again, this time dropping the occurrence ID.

The expected output is given below.

     customer1,product1,(20+25)/2 = 22.5


How can we write a Hive query for this?
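
The two-step averaging described above can be sketched as a nested GROUP BY: an inner query averages the price per customer, product, and occurrence ID, and an outer query averages those results per customer and product. The sketch below is only an illustration under assumptions. It runs the query with Python's built-in sqlite3 module standing in for Hive, and the table name `sales`, the column names, and the aliases `occ_avg` and `per_occurrence` are hypothetical. The query itself uses only AVG, GROUP BY, and an aliased subquery in the FROM clause, which Hive also supports, so it should carry over with little or no change.

```python
import sqlite3

# Sample rows from the question: (customer, product, price, occurrence ID).
ROWS = [
    ("customer1", "product1", 20, 1),
    ("customer1", "product2", 30, 2),
    ("customer1", "product1", 25, 3),
    ("customer1", "product1", 20, 1),
    ("customer1", "product2", 20, 2),
    ("customer1", "product2", 30, 2),
]

# Inner query: average price per (customer, product, occurrence).
# Outer query: average of those per-occurrence averages per (customer, product).
# The table and column names are assumptions made for this sketch.
QUERY = """
SELECT customer_name, product_name, AVG(occ_avg) AS avg_price
FROM (
    SELECT customer_name, product_name, occurrence_id, AVG(price) AS occ_avg
    FROM sales
    GROUP BY customer_name, product_name, occurrence_id
) per_occurrence
GROUP BY customer_name, product_name
ORDER BY customer_name, product_name
"""


def average_of_averages(rows):
    """Load the rows into an in-memory table and run the nested query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales ("
        "customer_name TEXT, product_name TEXT, "
        "price REAL, occurrence_id INTEGER)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


result = average_of_averages(ROWS)
for row in result:
    print(row)
```

For product1 this gives (20 + 25) / 2 = 22.5. A single AVG(price) grouped only by customer and product would instead give (20 + 25 + 20) / 3, roughly 21.67, because occurrence 1 appears twice in the data; that difference is why the two levels of grouping are needed.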

0 answers:

No answers