Hive: finding a running total

Time: 2017-09-27 16:14:46

Tags: sql hive

I have a table named Program with the following columns:

ProgDate(Date)
Episode(String)
Impression_id(int)
ProgName(String)

I want to find the total number of impressions for each date and episode. The following query works correctly:

Select progdate, episode, count(distinct impression_id) Impression
from Program
where progname='BBC'
group by progdate, episode
order by progdate, episode;
Result:
ProgDate        Episode     Impression      
20160919        1       5           
20160920        1       15          
20160921        1       10          
20160922        1       5           
20160923        2       25          
20160924        2       10          
20160925        2       25          

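But I also want the cumulative total for each episode. I searched for how to compute a running total, but the approaches I found accumulate across all previous rows. I want the total to restart for each episode, like this: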

Date        Episode     Impression  CumulativeImpressionsPerChannel     
20160919        1       5               5
20160920        1       15              20
20160921        1       10              30
20160922        1       5               35
20160923        2       25              25
20160924        2       10              35
20160925        2       25              60

1 answer:

Answer 0 (score: 1)

Recent versions of Hive (windowing was added in Hive 0.11) support windowing and analytic functions in HQL (ref 1) (ref 2), including SUM() OVER().
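To make clear what SUM() OVER (PARTITION BY ... ORDER BY ...) computes, here is a minimal sketch in plain Python rather than Hive. It uses the daily counts from the question; the function name `running_total_per_episode` is hypothetical, and the sketch assumes each episode has at most one row per date, so ties in the ordering do not arise.

```python
from itertools import accumulate, groupby
from operator import itemgetter

# Daily (progdate, episode, impression) rows, taken from the question's result set.
rows = [
    (20160919, 1, 5),
    (20160920, 1, 15),
    (20160921, 1, 10),
    (20160922, 1, 5),
    (20160923, 2, 25),
    (20160924, 2, 10),
    (20160925, 2, 25),
]


def running_total_per_episode(rows):
    """Mimic SUM(impression) OVER (PARTITION BY episode ORDER BY progdate)."""
    out = []
    # Sort by (episode, progdate) so each partition is contiguous and date-ordered.
    ordered = sorted(rows, key=itemgetter(1, 0))
    for _, group in groupby(ordered, key=itemgetter(1)):
        group = list(group)
        # The running sum starts again from zero for every new episode.
        totals = accumulate(impression for _, _, impression in group)
        out.extend((d, e, i, t) for (d, e, i), t in zip(group, totals))
    return out


for row in running_total_per_episode(rows):
    print(row)
```

The partition key decides where the sum restarts, and the ordering key decides the order in which rows are added. Dropping the grouping step would give the overall running total that the question wanted to avoid.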

Assuming you have such a version, I mimicked the syntax using PostgreSQL on SQL Fiddle:

CREATE TABLE d
    (ProgDate int, Episode int, Impression int)
;

INSERT INTO d
    (ProgDate, Episode, Impression)
VALUES
    (20160919, 1, 5),
    (20160920, 1, 15),
    (20160921, 1, 10),
    (20160922, 1, 5),
    (20160923, 2, 25),
    (20160924, 2, 10),
    (20160925, 2, 25)
;

Query 1

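select
      ProgDate, Episode, Impression
      -- running total that restarts for each episode
    , sum(Impression) over(partition by Episode order by ProgDate) CumImpsPerChannel
      -- running total across all rows, for comparison
    , sum(Impression) over(order by ProgDate) CumOverall
from (
       -- daily distinct impressions per episode (the query from the question)
       Select progdate, episode, count(distinct impression_id) Impression
       from Program
       where progname='BBC'
       group by progdate, episode
      ) d
order by ProgDate, Episode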

Results (the CumOverall column is not shown):

| progdate | episode | impression | cumimpsperchannel |
|----------|---------|------------|-------------------|
| 20160919 |       1 |          5 |                 5 |
| 20160920 |       1 |         15 |                20 |
| 20160921 |       1 |         10 |                30 |
| 20160922 |       1 |          5 |                35 |
| 20160923 |       2 |         25 |                25 |
| 20160924 |       2 |         10 |                35 |
| 20160925 |       2 |         25 |                60 |
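
To check the window expressions locally without a Hive cluster, one option is to run them against the sample table d in SQLite through Python's sqlite3 module. This is only a sketch under stated assumptions: SQLite stands in for Hive (much as the answer used PostgreSQL), it needs SQLite 3.25 or later for window functions, and the explicit ROWS BETWEEN frame is an addition that is not part of the original query.

```python
import sqlite3

# Assumption: an in-memory SQLite database (3.25+) stands in for Hive.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE d (ProgDate INTEGER, Episode INTEGER, Impression INTEGER)")
conn.executemany(
    "INSERT INTO d (ProgDate, Episode, Impression) VALUES (?, ?, ?)",
    [
        (20160919, 1, 5),
        (20160920, 1, 15),
        (20160921, 1, 10),
        (20160922, 1, 5),
        (20160923, 2, 25),
        (20160924, 2, 10),
        (20160925, 2, 25),
    ],
)

query = """
SELECT ProgDate, Episode, Impression,
       -- per-episode running total
       SUM(Impression) OVER (
           PARTITION BY Episode ORDER BY ProgDate
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS CumImpsPerChannel,
       -- running total over every row
       SUM(Impression) OVER (
           ORDER BY ProgDate
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS CumOverall
FROM d
ORDER BY ProgDate, Episode
"""

result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Spelling out the frame makes the intent explicit. With only ORDER BY, the default frame is range-based, so rows that tie on ProgDate would be summed together, which does not matter for this data because each date appears once.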