How to calculate a value in SQL from the previous month's average and the average of the same month last year

Time: 2021-05-30 12:23:42

Tags: sql exasolution exasol

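I want to calculate targets for open rate and click rate based on the actuals from the previous month and from the same month last year. My table is aggregated at the daily level, and I grouped it by month and year to get monthly averages. I then created a self join to join each current month to the results of the earlier months. This works for every month except January, because SQL does not know that month 1 should join to month 12 of the previous year. Is there a way to specify this in my join clause?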

Basically, the results for January 2021 should not be null, because I have data for December 2020.

Here is my data and my query:

CREATE TABLE exasol_last_year_avg(
    date_col date,
    country text,
    brand text,
    category text,
    delivered integer,
    opened integer,
    clicked integer
);

INSERT INTO exasol_last_year_avg 
(date_col,country,brand,category,delivered,opened,clicked) VALUES
('2021-01-01','AT','brand1','cat1',100,60,23),
('2021-01-01','AT','brand1','cat2',200,50,45),
('2021-01-01','AT','brand2','cat1',300,49,35),
('2021-01-01','AT','brand2','cat2',400,79,57),
('2021-02-02','AT','brand1','cat1',130,78,30),
('2021-02-02','AT','brand1','cat2',260,65,59),
('2021-02-02','AT','brand2','cat1',390,64,46),
('2021-02-02','AT','brand2','cat2',520,103,74),
('2020-12-02','AT','brand1','cat1',130,78,30),
('2020-12-02','AT','brand1','cat2',260,65,59),
('2020-12-02','AT','brand2','cat1',390,64,46),
('2020-12-02','AT','brand2','cat2',520,103,74),
('2020-02-02','AT','brand1','cat2',236,59,53),
('2020-02-02','AT','brand2','cat1',355,58,41),
('2020-02-02','AT','brand2','cat2',473,93,67),
('2020-02-02','AT','brand1','cat1',118,71,27);

This is written in PostgreSQL because I think it is more accessible to most people, but my production database is Exasol!

select *
from
(Select month_col,
           year_col,
           t_campaign_cmcategory,
           t_country,
           t_brand,
           (t2_clicktoopenrate + t3_clicktoopenrate)/2 as target_clicktoopenrate,
           (t2_openrate + t3_openrate)/2 as target_openrate
    from (
with CTE as (
select extract(month from date_col) as month_col,
       extract(year from date_col) as year_col, 
        category as t_campaign_cmcategory,
        country as t_country,
        brand as t_brand,
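        -- * 1.0 avoids integer division in PostgreSQL, which would truncate the rates to 0
        round(sum(opened) * 1.0 / nullif(sum(delivered),0),3) as OpenRate,
        round(sum(clicked) * 1.0 / nullif(sum(opened),0),3) as ClickToOpenRate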
from public.exasol_last_year_avg
group by 1, 2, 3, 4, 5)
select t1.month_col,
           t1.year_col,
           t2.month_col as t2_month_col,
           t2.year_col as t2_year_col,
           t3.month_col as t3_month_col,
           t3.year_col as t3_year_col,
           t1.t_campaign_cmcategory,
           t1.t_country,
           t1.t_brand,
           t1.OpenRate,
           t1.ClickToOpenRate,
           t2.OpenRate as t2_OpenRate,
           t2.ClickToOpenRate as t2_ClickToOpenRate,
           t3.OpenRate as t3_OpenRate,
           t3.ClickToOpenRate as t3_ClickToOpenRate
    from CTE t1
    left join CTE t2
    on t1.month_col = t2.month_col + 1
    and t1.year_col = t2.year_col
    and t1.t_campaign_cmcategory = t2.t_campaign_cmcategory
    and t1.t_country = t2.t_country
    and t1.t_brand = t2.t_brand
    left join CTE t3
    on t1.month_col = t3.month_col
    and t1.year_col = t3.year_col + 1
    and t1.t_campaign_cmcategory = t3.t_campaign_cmcategory
    and t1.t_country = t3.t_country
    and t1.t_brand = t3.t_brand) as target_base) as final_tbl

2 answers:

Answer 0 (score: 0)

Start with an aggregation query:

select date_trunc('month', date_col), country, brand, 
       sum(opened) * 1.0 / nullif(sum(delivered), 0) as OpenRate,
       sum(clicked) * 1.0 / nullif(sum(opened), 0) as ClickToOpenRate
from exasol_last_year_avg
group by 1, 2, 3;

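Then use window functions. This assumes you have a value for every month (no gaps), in which case you can just use lag(). I'm not sure what your final calculation is, but this brings in the data: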

with mcb as (
      select date_trunc('month', date_col) as yyyymm, country, brand, 
             sum(opened) * 1.0 / nullif(sum(delivered), 0) as OpenRate,
             sum(clicked) * 1.0 / nullif(sum(opened), 0) as ClickToOpenRate
      from exasol_last_year_avg
      group by 1, 2, 3
     )
select mcb.*,
       lag(openrate, 1) over (partition by country, brand order by yyyymm) as prev_month_openrate,
       lag(ClickToOpenRate, 1) over (partition by country, brand order by yyyymm) as prev_month_ClickToOpenRate,
       lag(openrate, 12) over (partition by country, brand order by yyyymm) as prev_year_openrate,
       lag(ClickToOpenRate, 12) over (partition by country, brand order by yyyymm) as prev_year_ClickToOpenRate
from mcb;
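
The no-gaps assumption matters here: the sample data in the question has nothing between 2020-02 and 2020-12, so lag(..., 1) on the December row would return February. The following is a minimal sketch (PostgreSQL syntax, not tested on Exasol) of one way to guard against that: it also lags the month itself and keeps the lagged rate only when that month is exactly one month, or one year, earlier. The `lagged` CTE and the `lag1_*` / `lag12_*` column names are made up for illustration, and `category` is added to the grouping on the assumption that targets are wanted per category, as in the question. Only OpenRate is shown; ClickToOpenRate would follow the same pattern.

```sql
with mcb as (
      -- one row per month / country / brand / category
      select cast(date_trunc('month', date_col) as date) as yyyymm,
             country, brand, category,
             sum(opened) * 1.0 / nullif(sum(delivered), 0) as OpenRate,
             sum(clicked) * 1.0 / nullif(sum(opened), 0) as ClickToOpenRate
      from exasol_last_year_avg
      group by 1, 2, 3, 4
     ),
lagged as (
      -- hypothetical helper CTE: lag the month alongside the rate
      select mcb.*,
             lag(yyyymm, 1)    over (partition by country, brand, category order by yyyymm) as lag1_month,
             lag(OpenRate, 1)  over (partition by country, brand, category order by yyyymm) as lag1_openrate,
             lag(yyyymm, 12)   over (partition by country, brand, category order by yyyymm) as lag12_month,
             lag(OpenRate, 12) over (partition by country, brand, category order by yyyymm) as lag12_openrate
      from mcb
     )
select yyyymm, country, brand, category, OpenRate,
       -- keep the lagged value only if it really comes from the expected month
       case when lag1_month  = yyyymm - interval '1 month' then lag1_openrate  end as prev_month_openrate,
       case when lag12_month = yyyymm - interval '1 year'  then lag12_openrate end as prev_year_openrate
from lagged;
```

With this guard a gap produces NULL rather than a value from the wrong month. It does not recover a previous-year value that exists but sits fewer than 12 rows back, so if gaps are common a join on the month is probably the more robust choice.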

Answer 1 (score: 0)

This works with a different join condition:

select *
from
(Select month_col,
           year_col,
           t_campaign_cmcategory,
           t_country,
           t_brand,
           (t2_clicktoopenrate + t3_clicktoopenrate)/2 as target_clicktoopenrate,
           (t2_openrate + t3_openrate)/2 as target_openrate
    from (
with CTE as (
select extract(month from date_col) as month_col,
       extract(year from date_col) as year_col, 
        category as t_campaign_cmcategory,
        country as t_country,
        brand as t_brand,
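        -- * 1.0 avoids integer division in PostgreSQL, which would truncate the rates to 0
        round(sum(opened) * 1.0 / nullif(sum(delivered),0),3) as OpenRate,
        round(sum(clicked) * 1.0 / nullif(sum(opened),0),3) as ClickToOpenRate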
from public.exasol_last_year_avg
group by 1, 2, 3, 4, 5)
select t1.month_col,
           t1.year_col,
           t2.month_col as t2_month_col,
           t2.year_col as t2_year_col,
           t3.month_col as t3_month_col,
           t3.year_col as t3_year_col,
           t1.t_campaign_cmcategory,
           t1.t_country,
           t1.t_brand,
           t1.OpenRate,
           t1.ClickToOpenRate,
           t2.OpenRate as t2_OpenRate,
           t2.ClickToOpenRate as t2_ClickToOpenRate,
           t3.OpenRate as t3_OpenRate,
           t3.ClickToOpenRate as t3_ClickToOpenRate
    from CTE t1
    left join CTE t2
-- adjusted join condition
    on ((t1.month_col = (CASE WHEN t1.month_col = 1 then t2.month_col - 11 END) and t1.year_col = t2.year_col + 1)
or (t1.month_col = (CASE WHEN t1.month_col != 1 then t2.month_col + 1 END) and t1.year_col = t2.year_col))
    and t1.t_campaign_cmcategory = t2.t_campaign_cmcategory
    and t1.t_country = t2.t_country
    and t1.t_brand = t2.t_brand
    left join CTE t3
    on t1.month_col = t3.month_col
    and t1.year_col = t3.year_col + 1
    and t1.t_campaign_cmcategory = t3.t_campaign_cmcategory
    and t1.t_country = t3.t_country
    and t1.t_brand = t3.t_brand) as target_base) as final_tbl
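
The CASE-based condition can arguably be simplified by joining on a running month index (year * 12 + month), so that the step from December to January is just another +1. Below is a minimal sketch in PostgreSQL syntax against the same table; `month_idx` is a hypothetical helper column that does not appear in the original query, and the casts to integer are only there to keep the index a whole number.

```sql
with CTE as (
    select cast(extract(year from date_col) as integer)  as year_col,
           cast(extract(month from date_col) as integer) as month_col,
           -- hypothetical helper: consecutive months differ by exactly 1,
           -- including December -> January
           cast(extract(year from date_col) as integer) * 12
             + cast(extract(month from date_col) as integer) as month_idx,
           category as t_campaign_cmcategory,
           country  as t_country,
           brand    as t_brand,
           round(sum(opened)  * 1.0 / nullif(sum(delivered), 0), 3) as OpenRate,
           round(sum(clicked) * 1.0 / nullif(sum(opened), 0), 3)    as ClickToOpenRate
    from exasol_last_year_avg
    group by 1, 2, 3, 4, 5, 6
)
select t1.year_col,
       t1.month_col,
       t1.t_campaign_cmcategory,
       t1.t_country,
       t1.t_brand,
       t2.OpenRate as prev_month_openrate,
       t3.OpenRate as prev_year_openrate,
       (t2.OpenRate + t3.OpenRate) / 2               as target_openrate,
       (t2.ClickToOpenRate + t3.ClickToOpenRate) / 2 as target_clicktoopenrate
from CTE t1
left join CTE t2                          -- previous month
  on t1.month_idx = t2.month_idx + 1
 and t1.t_campaign_cmcategory = t2.t_campaign_cmcategory
 and t1.t_country = t2.t_country
 and t1.t_brand = t2.t_brand
left join CTE t3                          -- same month, previous year
  on t1.month_idx = t3.month_idx + 12
 and t1.t_campaign_cmcategory = t3.t_campaign_cmcategory
 and t1.t_country = t3.t_country
 and t1.t_brand = t3.t_brand;
```

With the sample data, the previous-month join for January 2021 now resolves to December 2020. The target columns are still NULL whenever either the previous-month or the previous-year row is missing, because NULL propagates through the addition; whether to fall back to the one available value is a design decision left open here.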