Hive query to get the top 3 values of one column, ranked by the sum of another column

Asked: 2019-05-03 18:48:21

Tags: sql hadoop hive hiveql

I have 3 columns: Key_id, gst, and claim_amnt. A single key_id has multiple gst values, and a single gst value has multiple claim_amnt values. For example, I have the table below.

 Key_id   gst     claim_amnt
  1245     5         235
  1245     5         100
  1245     4         254   
  1245     4         542
  1245     4         471
  1245     2         745
  1246     1         24
  1246     1         45
  1246     5         52
  1246     5         42
  1246     4         45

The sum of claim_amnt for each gst within each key_id looks like this:

key_id  gst   (addition of claim_amnt with respect to its gst)
1245    5     335
1245    4     1267
1245    2     745
1246    1     69
1246    5     94
1246    4     45
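
For reference, the totals above correspond to a plain GROUP BY on the two key columns. This is a minimal sketch that assumes the table is named t (the placeholder name used in the answer below); total_claim_amnt is an illustrative alias, not a column from the original table.

```sql
-- Assumption: the source table is named t and has columns key_id, gst, claim_amnt.
-- Produces one row per (key_id, gst) with the summed claim amount.
select key_id,
       gst,
       sum(claim_amnt) as total_claim_amnt  -- illustrative alias
from t
group by key_id, gst;
```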

For each key_id, I need the top 3 gst values ranked by their total claim_amnt.

Required output:

key_id  gst_top1   gst_top2   gst_top3
1245      4          2           5
1246      5          1           4

I can get the top 3 gst values into a single column value, but I can't work out how to split them into 3 separate fields.

1 Answer:

Answer 0: (score: 0)

You can do this with a couple of levels of aggregation:

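-- Inner query: total claim_amnt per (key_id, gst), ranked within each key_id.
-- Outer query: pivot the top three ranks into separate columns.
select key_id,
       max(case when seqnum = 1 then gst end) as gst_1,
       max(case when seqnum = 2 then gst end) as gst_2,
       max(case when seqnum = 3 then gst end) as gst_3
from (select key_id, gst, sum(claim_amnt) as total_claim_amnt,
             row_number() over (partition by key_id order by sum(claim_amnt) desc) as seqnum
      from t
      group by key_id, gst
     ) s  -- Hive requires an alias on a subquery in FROM
group by key_id;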
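
If the Hive version in use does not accept an aggregate inside the window's ORDER BY, the same idea can be split into explicit steps with common table expressions. The following is only a sketch under assumptions: the table name t comes from the query above, the column is spelled claim_amnt as in the question, and the CTE names totals and ranked and the alias total_claim_amnt are illustrative.

```sql
-- Assumption: source table t with columns key_id, gst, claim_amnt.
with totals as (
  -- Step 1: one row per (key_id, gst) with the summed claim amount
  select key_id, gst, sum(claim_amnt) as total_claim_amnt
  from t
  group by key_id, gst
),
ranked as (
  -- Step 2: number the gst values within each key_id, largest total first
  select key_id, gst,
         row_number() over (partition by key_id
                            order by total_claim_amnt desc) as seqnum
  from totals
)
-- Step 3: pivot the top three ranks into separate columns
select key_id,
       max(case when seqnum = 1 then gst end) as gst_top1,
       max(case when seqnum = 2 then gst end) as gst_top2,
       max(case when seqnum = 3 then gst end) as gst_top3
from ranked
where seqnum <= 3
group by key_id;
```

On the sample data in the question, this should return 4, 2, 5 for key_id 1245 and 5, 1, 4 for key_id 1246, matching the required output. Note that row_number() breaks ties arbitrarily; if two gst values can have the same total, adding gst to the ORDER BY (for example, order by total_claim_amnt desc, gst) makes the result deterministic.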