Calculating the running total of sales for each day of the month in Redshift

Time: 2020-06-28 06:11:07

Tags: sql amazon-redshift

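Problem statement: running sales for each product on each calendar day.

Brief background: when a sale is made, an entry is added to the sales table. If a particular product has no sales on a particular day, no record is inserted.

Sales table structure: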

+------------+--------------+------------+
| date       | Product Code | total_sale |
+------------+--------------+------------+
| 2020-01-15 |        abc   |        100 |
| 2020-01-16 |        abc   |        200 |
| 2020-01-17 |        abc   |        200 |
| 2020-01-16 |        tvc   |        200 |
| 2020-01-16 |        sfr   |        200 |
+------------+--------------+------------+

SQL to generate the table above:


    create temporary table sales_daily as 
    select '20200115' :: date as sales_day, 'abc' as product_Code , 100 as sales
    Union all 
    select '20200116' :: date as sales_day, 'abc' as product_Code , 200 as sales
    Union all 
    select '20200117' :: date as sales_day, 'abc' as product_Code , 200 as sales
    Union all 
    select '20200115' :: date as sales_day, 'tvc' as product_Code , 200 as sales
    Union all 
    select '20200115' :: date as sales_day, 'sfr' as product_Code , 200 as sales
    ;
    
    select * from sales_Daily; 

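Expected output: for each calendar day of the month (January 2020 in this example), the rolling sales over the last n days (n can be any number and will be fixed in the final query).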

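Steps taken so far: I tried combining an existing calendar table (a snippet that creates it is shared below) with the sum window function. However, the rolling sum has to be computed per product code, so the window is partitioned by product code rather than at the date level, and calendar days on which a product had no sale get no row for that product. I understand that this is expected behavior. What I am asking is which approach to take to solve this problem in Redshift, and whether it can be solved with window functions at all.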

SQL to create the calendar table:

create temporary table calendar as 
select '20200101' :: date As calendar_day Union all
select '20200102' :: date As calendar_day Union all
select '20200103' :: date As calendar_day Union all
select '20200104' :: date As calendar_day Union all
select '20200105' :: date As calendar_day Union all
select '20200106' :: date As calendar_day Union all
select '20200107' :: date As calendar_day Union all
select '20200108' :: date As calendar_day Union all
select '20200109' :: date As calendar_day Union all
select '20200110' :: date As calendar_day Union all
select '20200111' :: date As calendar_day Union all
select '20200112' :: date As calendar_day Union all
select '20200113' :: date As calendar_day Union all
select '20200114' :: date As calendar_day Union all
select '20200115' :: date As calendar_day Union all
select '20200116' :: date As calendar_day Union all
select '20200117' :: date As calendar_day Union all
select '20200118' :: date As calendar_day Union all
select '20200119' :: date As calendar_day Union all
select '20200120' :: date As calendar_day Union all
select '20200121' :: date As calendar_day Union all
select '20200122' :: date As calendar_day Union all
select '20200123' :: date As calendar_day Union all
select '20200124' :: date As calendar_day Union all
select '20200125' :: date As calendar_day Union all
select '20200126' :: date As calendar_day Union all
select '20200127' :: date As calendar_day Union all
select '20200128' :: date As calendar_day Union all
select '20200129' :: date As calendar_day Union all
select '20200130' :: date As calendar_day Union all
select '20200131' :: date As calendar_day 
;

SQL for the final output so far:

select 
calendar_Day,
sales_day,
product_Code,
sales,
sum(sales) over (partition by product_Code order by calendar_Day  rows between 1 PRECEDING and current row) running_salest1day
from  calendar
left join sales_daily on calendar_day :: date =  sales_day :: date
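To see why this attempt does not yield one row per product per day, it can help to look at what the left join returns for days without sales. A minimal sketch, assuming the `calendar` and `sales_daily` temporary tables created above (the aliases `c` and `s` are only for illustration):

```sql
-- Calendar days with no matching sale come back with NULL in every
-- sales_daily column, including product_Code.
select c.calendar_day,
       s.product_Code,   -- NULL on days with no sale
       s.sales           -- NULL on days with no sale
from calendar c
left join sales_daily s
  on c.calendar_day = s.sales_day
where s.sales_day is null
order by c.calendar_day;
```

With the sample data this returns the 28 days of January other than the 15th, 16th and 17th. Because `product_Code` is NULL on all of them, they end up together in a single NULL partition and never appear in the window of `abc`, `tvc` or `sfr`.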

2 answers:

Answer 0 (score: 2)

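You will have to use a window function to accumulate the amounts. The query below creates a dummy 0-value sale for every product and every calendar day. This ensures the output contains a running-sales row for every product and every calendar day.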

select sales_day as Calendar_Day,
       product_Code,
       sales, 
       sum(sales) over (partition by product_Code order by sales_day  rows between 1 PRECEDING and current row) running_salest1day
from (
  select sales_day, product_Code, sum(sales) as sales
  from (
    select sales_day, product_Code, sales -- Actual Sales entry
    from sales_daily
    union all
    select calendar_day as sales_day, dp.product_Code, 0 as sales -- Dummy Sales entry for each date
    from (
      select distinct product_Code from sales_daily 
    ) dp cross join calendar
  ) sd
  group by sales_day, product_Code
) asd

Here is the SQL Fiddle.
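The frame `rows between 1 preceding and current row` covers two rows: the current day and the day before. Since the inner query guarantees exactly one row per product and calendar day, a frame of n - 1 preceding rows should cover the last n calendar days. A sketch for a 7-day window follows; the value 7 and the column name `running_sales_7day` are arbitrary choices for illustration, not part of the answer.

```sql
select sales_day as calendar_day,
       product_Code,
       sales,
       sum(sales) over (
         partition by product_Code
         order by sales_day
         rows between 6 preceding and current row   -- 6 = n - 1 for n = 7
       ) as running_sales_7day
from (
  select sd.sales_day, sd.product_Code, sum(sd.sales) as sales
  from (
    select sales_day, product_Code, sales            -- actual sales
    from sales_daily
    union all
    select c.calendar_day as sales_day, dp.product_Code, 0 as sales  -- 0-value filler rows
    from (select distinct product_Code from sales_daily) dp
    cross join calendar c
  ) sd
  group by sd.sales_day, sd.product_Code
) asd
order by product_Code, calendar_day;
```

With the sample data, product `abc` should show 500 from 2020-01-17 through 2020-01-21, 400 on 2020-01-22, 200 on 2020-01-23 and 0 from 2020-01-24 onward. Note that the window only sees days present in `calendar`, so the first days of January do not include any December sales unless the calendar table is extended backwards.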

Answer 1 (score: 1)

Use a cross join between the dates and the products to generate the rows, then use a cumulative sum. So:

select c.calendar_Day, p.product_Code, s.sales,
       sum(s.sales) over (partition by p.product_code
                          order by c.calendar_Day
                          rows between n preceding and current row  -- replace n with an integer literal
                         ) as running_sales
from calendar c cross join
     (select distinct product_code from sales_daily) p left join
     sales_daily s
     on c.calendar_day = s.sales_day and p.product_code = s.product_code
order by p.product_code, c.calendar_day;
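In this approach, days without a sale carry a NULL in `sales`. `sum` ignores NULLs, but it returns NULL when every row in the frame is NULL, so some days would show NULL instead of 0. A minimal sketch that wraps the value in `coalesce`, assuming the same `calendar` and `sales_daily` tables and using 1 as an illustrative frame size:

```sql
select c.calendar_day,
       p.product_code,
       coalesce(s.sales, 0) as sales,               -- show 0 on days with no sale
       sum(coalesce(s.sales, 0)) over (
         partition by p.product_code
         order by c.calendar_day
         rows between 1 preceding and current row   -- current day plus the day before
       ) as running_sales
from calendar c
cross join (select distinct product_code from sales_daily) p
left join sales_daily s
  on c.calendar_day = s.sales_day
 and p.product_code = s.product_code
order by p.product_code, c.calendar_day;
```

This assumes at most one `sales_daily` row per product per day. If a product can have several rows on the same day, the join produces several rows for that day and a `rows` frame would count rows rather than days, so the sales should be aggregated per day first.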