How can I compute the sum over the previous 12 months from this table using the Hive query language?

Time: 2016-12-19 03:35:51

Tags: sql hive

month   first_member
201612  135054
201611  250507
201610  296114
201609  317501
201608  427143
201607  449202
201606  398261
201605  419880
201604  393784
201603  459383
.....

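This is my table, which has two columns. For each month, I want to compute the sum of first_member over the 12 months before it. For example, one record in the new table would contain 201605 and the sum of first_member between 201505 and 201605. How should I write the query to create the new table? month is a string and first_member is an int.

2 answers:

Answer 0: (score: 1)

Use the sum window function. This gives you, for each month, the sum of first_member over the last 12 months (the current month plus the 11 months before it).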

select month,
sum(first_member) over(order by cast(substr(month,1,4) as int),cast(substr(month,5) as int)
                       rows between 11 preceding and current row) rolling_sum 
from tablename

Edit: Based on the OP's comment, "I want to include 201505 but not 201605", the required change would be:

select month,
sum(first_member) over(order by cast(substr(month,1,4) as int),cast(substr(month,5) as int)
                       rows between 12 preceding and 1 preceding) rolling_sum 
from tablename

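Note that this assumes the table has a first_member value for every month in the specified range. The frame counts rows, not calendar months, so a missing month would silently pull an older month into the sum.

Answer 1: (score: 1)

When the month column is in YYYYMM format, there is no need to split it: fixed-width, zero-padded YYYYMM strings sort lexicographically in chronological order.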
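If some months might be missing from the table, one possible workaround is to order by a numeric month index and use a RANGE frame, which bounds the window by value rather than by row count. The following is only a sketch under assumptions: month_idx is a hypothetical helper column that does not appear in the original answer, and it relies on the Hive version in use supporting numeric offsets in RANGE frames.

```sql
-- Sketch only: month_idx is a hypothetical helper (year * 12 + month number),
-- so two consecutive calendar months always differ by exactly 1.
select month,
       sum(first_member) over (order by month_idx
                               -- RANGE bounds the frame by month_idx value, not by
                               -- row count, so a missing month simply adds nothing.
                               range between 12 preceding and 1 preceding) as rolling_sum
from (
    select month,
           first_member,
           cast(substr(month, 1, 4) as int) * 12
             + cast(substr(month, 5, 2) as int) as month_idx
    from tablename
) t
```

With this frame, the row for 201605 covers month_idx values from 201505 through 201604, whether or not every one of those months has a row.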

-- Backticks quote the column name; in Hive, double quotes denote a string literal.
select  `month`

       ,sum(first_member) over
        (
            order by  `month`
            rows      between 12 preceding and 1 preceding
        ) as running_total

from    tablename
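The question also asks how to store the result in a new table. A minimal sketch using Hive's CREATE TABLE AS SELECT is shown here. The target table name first_member_rolling is hypothetical, and the frame is the "12 preceding and 1 preceding" variant from the answers.

```sql
-- first_member_rolling is a hypothetical name for the new table.
create table first_member_rolling as
select month,
       sum(first_member) over (order by month
                               rows between 12 preceding and 1 preceding) as rolling_sum
from tablename;
```

The earliest month in the table has no preceding rows, so its rolling_sum should come out as NULL, and each of the next eleven months sums fewer than 12 rows. If only complete 12-month windows are wanted, those early rows would need to be filtered out afterwards.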