Calculate share of value per date and country and handle 0 values separately

Asked: 2021-03-16 09:23:21

Tags: sql postgresql

DB-Fiddle

CREATE TABLE sales (
    id SERIAL PRIMARY KEY,
    event_date DATE,
    country VARCHAR,
    channel VARCHAR,
    sales DECIMAL
);

INSERT INTO sales
(event_date, country, channel, sales)
VALUES 
('2020-02-08', 'DE', 'channel_01', '500'),
('2020-02-08', 'DE', 'channel_02', '400'),
('2020-02-08', 'DE', 'channel_03', '200'),
('2020-02-08', 'FR', 'channel_01', '900'),
('2020-02-08', 'FR', 'channel_02', '800'),
('2020-02-08', 'NL', 'channel_01', '100'),

('2020-03-20', 'DE', 'channel_01', '0'),
('2020-03-20', 'FR', 'channel_01', '0'),
('2020-03-20', 'FR', 'channel_02', '0'),
('2020-03-20', 'FR', 'channel_03', '0'),
('2020-03-20', 'NL', 'channel_01', '0'),

('2020-04-15', 'DE', 'channel_01', '700'),
('2020-04-15', 'FR', 'channel_01', '500'),
('2020-04-15', 'NL', 'channel_01', '850'),
('2020-04-15', 'NL', 'channel_02', '250'),
('2020-04-15', 'NL', 'channel_03', '300');

Expected result:

event_date  |  country  |  share_per_day_per_country |        details of share calculation
------------|-----------|----------------------------|--------------------------------------------
2020-02-08  |     DE    |           0.379            |  = (500+400+200) / (500+400+200+900+800+100)
2020-02-08  |     FR    |           0.586            |  = (900+800)     / (500+400+200+900+800+100)
2020-02-08  |     NL    |           0.034            |  = (100)         / (500+400+200+900+800+100)
------------|-----------|----------------------------|--------------------------------------------
2020-03-20  |     DE    |           0.333            |  = equal split in case of 0 sales
2020-03-20  |     FR    |           0.333            |  = equal split in case of 0 sales
2020-03-20  |     NL    |           0.333            |  = equal split in case of 0 sales
------------|-----------|----------------------------|--------------------------------------------
2020-04-15  |     DE    |           0.269            |  = (700)         / (700+500+850+250+300)
2020-04-15  |     FR    |           0.192            |  = (500)         / (700+500+850+250+300)
2020-04-15  |     NL    |           0.538            |  = (850+250+300) / (700+500+850+250+300)

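In the expected result I want to:

  1. calculate the share of sales per country per day;
  2. if there are no sales on a day, split the share equally across the number of countries.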
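
Stated as a formula, the two rules above amount to the following. The symbols S_{d,c}, S_d and N_d are notation introduced here for illustration and are not part of the original question:

```latex
\[
\text{share}(d, c) =
\begin{cases}
  \dfrac{S_{d,c}}{S_d} & \text{if } S_d \neq 0, \\[1ex]
  \dfrac{1}{N_d}       & \text{if } S_d = 0,
\end{cases}
\qquad
S_d = \sum_{c'} S_{d,c'}
\]
```

Here S_{d,c} is the total sales of country c on date d (summed over its channels) and N_d is the number of distinct countries that have rows on date d. For example, DE on 2020-02-08 gets 1100 / 2900 ≈ 0.379, and each of the three countries on 2020-03-20 gets 1/3 ≈ 0.333.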

To achieve this, I came up with this query:

SELECT
t1.event_date,
t1.country,
t1.sales,
t1.total_sales_per_country,
t1.total_sales_per_day,

(CASE WHEN SUM(t1.sales) OVER (PARTITION BY t1.event_date) = 0 THEN 
100/(COUNT(t1.country) OVER (PARTITION BY t1.event_date))/100::decimal
ELSE t1.total_sales_per_country / t1.total_sales_per_day END) AS share_per_day_per_country
  
FROM

  (SELECT
  s.event_date,
  s.country,
  s.sales,
  SUM(s.sales) OVER (PARTITION BY s.event_date) AS total_sales_per_day,
  SUM(s.sales) OVER (PARTITION BY s.event_date, s.country) AS total_sales_per_country
  FROM sales s
  GROUP BY 1,2,3
  ORDER BY 1,2) t1
  
GROUP BY 1,2,3,4,5
ORDER BY 1,2

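This query gives me almost the right result.
However, instead of listing each event_date and country combination only once, it lists them multiple times.

I have tried several things (e.g. DISTINCT pl.event_date) to fix this, but none of them worked.

How do I need to modify the query to get the exact expected result?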

1 answer:

Answer 0 (score: 0)

Remove t1.sales and t1.total_sales_per_country from the select list, then use DISTINCT instead of GROUP BY.
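
The answer does not explain why the duplicates appear. A likely reason, sketched here against the sample data, is that the inner query groups by s.sales, so a country with several different channel values on one day still produces several rows, and the outer query keeps them apart because t1.sales is part of its select list and GROUP BY:

```sql
-- Illustration only: mirrors the inner query's GROUP BY for a single date and country.
SELECT
  s.event_date,
  s.country,
  s.sales
FROM sales s
WHERE s.event_date = '2020-02-08'
  AND s.country = 'DE'
GROUP BY 1, 2, 3
ORDER BY 3;
```

With the sample rows this should return three rows (sales of 200, 400 and 500) rather than a single DE row, which is consistent with the repeated lines described in the question.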

Query:

SELECT DISTINCT
  t1.event_date,
  t1.country,
  t1.total_sales_per_day,
  -- Zero-sales day: equal split over the rows of t1 for that date (three here, one per country).
  -- Note: 100 / COUNT(...) is integer division (100 / 3 = 33), so this branch yields 0.33.
  (CASE WHEN SUM(t1.sales) OVER (PARTITION BY t1.event_date) = 0 THEN
    100/(COUNT(t1.country) OVER (PARTITION BY t1.event_date))/100::decimal
  ELSE t1.total_sales_per_country / t1.total_sales_per_day END) AS share_per_day_per_country
FROM
  (SELECT
    s.event_date,
    s.country,
    s.sales,
    SUM(s.sales) OVER (PARTITION BY s.event_date) AS total_sales_per_day,
    SUM(s.sales) OVER (PARTITION BY s.event_date, s.country) AS total_sales_per_country
  FROM sales s
  GROUP BY 1,2,3
  ORDER BY 1,2) t1
ORDER BY 1,2

Output:

event_date  |  country  |  total_sales_per_day  |  share_per_day_per_country
------------|-----------|-----------------------|----------------------------
2020-02-08  |     DE    |         2900          |  0.37931034482758620690
2020-02-08  |     FR    |         2900          |  0.58620689655172413793
2020-02-08  |     NL    |         2900          |  0.03448275862068965517
2020-03-20  |     DE    |            0          |  0.33000000000000000000
2020-03-20  |     FR    |            0          |  0.33000000000000000000
2020-03-20  |     NL    |            0          |  0.33000000000000000000
2020-04-15  |     DE    |         2600          |  0.26923076923076923077
2020-04-15  |     FR    |         2600          |  0.19230769230769230769
2020-04-15  |     NL    |         2600          |  0.53846153846153846154
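
Two details of this result may be worth revisiting. The zero-sales rows show 0.33 rather than the 0.333 in the expected result, because 100 / COUNT(...) is an integer division (100 / 3 = 33). The inner GROUP BY 1,2,3 would also appear to merge two channels of the same country that happen to report equal sales on the same day, which would understate the totals. A minimal sketch of one alternative follows, assuming the goal is exactly one row per date and country; ROUND is used only to match the three decimals of the expected result:

```sql
-- Sketch, not the answer's original query: aggregate per date and country first,
-- then apply window functions over those aggregates.
SELECT
  s.event_date,
  s.country,
  ROUND(
    CASE
      -- No sales on this date: split equally across the countries present that day.
      -- After GROUP BY there is one row per country, so COUNT(*) counts countries.
      WHEN SUM(SUM(s.sales)) OVER (PARTITION BY s.event_date) = 0
        THEN 1.0 / COUNT(*) OVER (PARTITION BY s.event_date)
      -- Otherwise: the country's total divided by the day's total.
      ELSE SUM(s.sales) / SUM(SUM(s.sales)) OVER (PARTITION BY s.event_date)
    END,
    3
  ) AS share_per_day_per_country
FROM sales s
GROUP BY s.event_date, s.country
ORDER BY s.event_date, s.country;
```

On the sample data this should give 0.333 for each country on 2020-03-20 and leave the shares for the other dates unchanged apart from rounding. Writing 1.0 instead of 100 forces numeric division, and grouping by date and country removes the need for DISTINCT.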

db<>fiddle here