Split a fixed monthly value across days and countries in Redshift

Asked: 2021-03-03 14:33:45

Tags: sql amazon-redshift

DB-Fiddle

CREATE TABLE sales (
    id SERIAL PRIMARY KEY,
    country VARCHAR(255),
    sales_date DATE,
    sales_volume DECIMAL,
    fix_costs DECIMAL
);

INSERT INTO sales
(country, sales_date, sales_volume, fix_costs
)
VALUES 

('DE', '2020-01-03', '500', '2000'),
('NL', '2020-01-03', '320', '2000'),
('FR', '2020-01-03', '350', '2000'),
('None', '2020-01-31', '0', '2000'),

('DE', '2020-02-15', '0', '5000'),
('NL', '2020-02-15', '0', '5000'),
('FR', '2020-02-15', '0', '5000'),
('None', '2020-02-29', '0', '5000'),

('DE', '2020-03-27', '180', '4000'),
('NL', '2020-03-27', '670', '4000'),
('FR', '2020-03-27', '970', '4000'),
('None', '2020-03-31', '0', '4000');

Expected result:

sales_date   |   country    |   sales_volume   |     used_fix_costs
-------------|--------------|------------------|------------------------------------------
2020-01-03   |     DE       |       500        |     37.95  (= 2000/31 = 64.5 x 0.59)
2020-01-03   |     FR       |       350        |     26.57  (= 2000/31 = 64.5 x 0.41)
2020-01-03   |     NL       |       320        |      0.00
-------------|--------------|------------------|------------------------------------------
2020-02-15   |     DE       |         0        |     86.21  (= 5000/29 = 172.4 x 0.50)
2020-02-15   |     FR       |         0        |     86.21  (= 5000/29 = 172.4 x 0.50)
2020-02-15   |     NL       |         0        |      0.00
-------------|--------------|------------------|------------------------------------------    
2020-03-27   |     DE       |       180        |     20.20  (= 4000/31 = 129.0 x 0.16) 
2020-03-27   |     FR       |       970        |    108.84  (= 4000/31 = 129.0 x 0.84)   
2020-03-27   |     NL       |       670        |      0.00
-------------|--------------|------------------|-------------------------------------------

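The used_fix_costs column in the expected result is calculated as follows:

Step 1) Exclude the country NL from the following steps; it should still appear in the result with the value 0.

Step 2) Get the daily rate of the monthly fix_costs. (2000/31 = 64.5; 5000/29 = 172.4; 4000/31 = 129.0)

Step 3) Split the daily value between DE and FR according to their share of sales_volume. (500/850 = 0.59; 350/850 = 0.41; 180/1150 = 0.16; 970/1150 = 0.84)

Step 4) If sales_volume is 0, the daily rate is split 50/50 between DE and FR, as you can see for 2020-02-15.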
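As a sanity check of steps 2 to 4, the arithmetic for the sample rows can be reproduced with plain literals. This is only an illustrative sketch: the column aliases are made up, and the figures are taken from the sample data above.

```sql
-- Step 2: daily rate = monthly fix_costs / days in the month
--         (2020 is a leap year, so February has 29 days)
-- Step 3: share = country volume / combined volume of DE and FR
-- Step 4: with no sales, DE and FR each get half of the daily rate
SELECT
    ROUND(2000.0 / 31 * 500 / 850, 2)  AS de_2020_01_03,  -- 37.95
    ROUND(2000.0 / 31 * 350 / 850, 2)  AS fr_2020_01_03,  -- 26.57
    ROUND(5000.0 / 29 * 0.5, 2)        AS de_2020_02_15,  -- 86.21
    ROUND(4000.0 / 31 * 180 / 1150, 2) AS de_2020_03_27,  -- 20.20
    ROUND(4000.0 / 31 * 970 / 1150, 2) AS fr_2020_03_27;  -- 108.84
```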


I am currently using this query to get the expected result:

SELECT
s.sales_date, 
s.country,
s.sales_volume,
s.fix_costs,

 (CASE WHEN country = 'NL' THEN 0
       
       /* Exclude NL from fixed_costs calculation */
       WHEN SUM(CASE WHEN country <> 'NL' THEN sales_volume ELSE 0 END) OVER (PARTITION BY sales_date) > 0
       THEN ((s.fix_costs/ extract(day FROM (date_trunc('month', sales_date + INTERVAL '1 month') - INTERVAL '1 day'))) *
              sales_volume / 
              NULLIF(SUM(s.sales_volume) FILTER (WHERE s.country != 'NL')  OVER (PARTITION BY s.sales_date), 0)
              )
              
        /* Divide fixed costs equally among countries in case of no sales */
        ELSE (s.fix_costs / extract(day FROM (date_trunc('month', sales_date + INTERVAL '1 month') - INTERVAL '1 day'))) 
              / SUM(CASE WHEN country <> 'NL' THEN 1 ELSE 0 END) OVER (PARTITION by sales_date)
              
        END) AS imputed_fix_costs
        
FROM sales s
WHERE country NOT IN ('None')
GROUP BY 1,2,3,4
ORDER BY 1;

This query works on DB-Fiddle.
However, when I run it on Amazon Redshift, I get an error message for the line
FILTER (WHERE pl.sales_Channel NOT IN ('Marketplace','B2B'))


Do you know how I can replace or adjust this part of the query so that it also works in Amazon Redshift?

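1 answer:

Answer 0 (score: 1)

If I understand correctly, you want to apportion the daily share of the fixed costs across all countries other than the Netherlands: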

select s.*,
       (case
            -- NL never receives a share of the fixed costs
            when country = 'NL' then 0
            -- no sales outside NL on this date: split the daily rate equally
            when sum(case when country <> 'NL' then sales_volume else 0 end) over (partition by sales_date) = 0
            then (fix_costs / datepart(day, last_day(sales_date))) * 1.0
                 / sum(case when country <> 'NL' then 1 else 0 end) over (partition by sales_date)
            -- otherwise split the daily rate by share of non-NL sales volume
            else (fix_costs / datepart(day, last_day(sales_date)))
                 * (sales_volume / sum(case when country <> 'NL' then sales_volume end) over (partition by sales_date))
        end) as apportioned_fix_costs
from sales s
where country <> 'None';
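The main change compared with the query in the question is that the FILTER clause, which appears to be what Redshift rejects, is replaced by a CASE expression inside the aggregate. A minimal sketch of that rewrite on the same sales table follows; the alias non_nl_volume is made up for illustration.

```sql
-- PostgreSQL form used in the question:
--   SUM(sales_volume) FILTER (WHERE country <> 'NL') OVER (PARTITION BY sales_date)
-- Equivalent without FILTER: rows that fail the condition yield NULL,
-- and SUM ignores NULLs.
SELECT sales_date,
       country,
       sales_volume,
       SUM(CASE WHEN country <> 'NL' THEN sales_volume END)
           OVER (PARTITION BY sales_date) AS non_nl_volume
FROM sales
WHERE country <> 'None';
```

The same pattern works for counting: SUM(CASE WHEN ... THEN 1 ELSE 0 END) stands in for COUNT(*) FILTER (WHERE ...).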

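Note: you don't seem to want None in the results, so it is simply filtered out. The rest of the data also seems to fall on a single date within each month. If it can actually fall on multiple dates, use date_trunc() in the partition by clause.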
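For the multiple-dates case mentioned in the note, a hedged sketch of the month-level partition might look like this. The alias non_nl_volume_in_month is made up, and whether the share should then be taken per month or per day depends on requirements the question does not state.

```sql
-- Sum non-NL volume over the whole calendar month instead of a single date.
SELECT sales_date,
       country,
       sales_volume,
       SUM(CASE WHEN country <> 'NL' THEN sales_volume ELSE 0 END)
           OVER (PARTITION BY date_trunc('month', sales_date)) AS non_nl_volume_in_month
FROM sales
WHERE country <> 'None';
```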

For reference, Postgres does not support last_day(). You can use this expression instead:

select extract(day from date_trunc('month', sales_date) + interval '1 month' - interval '1 day')
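To see the expression at work without the sales table, here is a small Postgres sketch that applies it to the three sample dates, including the leap-year February. The aliases t, d, sample_date and days_in_month are made up for illustration.

```sql
-- First day of the month + 1 month - 1 day = last day of the month;
-- its day-of-month is the number of days in that month.
SELECT d AS sample_date,
       extract(day from date_trunc('month', d) + interval '1 month' - interval '1 day') AS days_in_month
FROM (VALUES (date '2020-01-03'), (date '2020-02-15'), (date '2020-03-27')) AS t(d)
ORDER BY d;
-- days_in_month: 31, 29, 31
```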

DB-Fiddle
