AVG function with multiple conditions and columns in SQLite

Time: 2019-03-17 22:37:24

Tags: sql sqlite conditional average

I figured out how to use the AVG function with one or more conditions, but I couldn't make it work for my specific need. Let's say I'm working with this table, called sales_performance:

  product_ID  sales_period  sales_qty  sales_index  product_sub goal_met
         C12          0001         15           20          D71        Y
         D71          0001         07           09          C12        N
         F20          0001         25           30          C05        Y
         C05          0001         10           15          F20        N
         C12          0002         15           30          C05        Y
         C05          0002         12           06          C12        N
         D71          0002         30           20          F20        Y
         F20          0002         20           15          D71        N
         C12          0003         05           04          F20        N
         F20          0003         40           35          C12        Y
         D71          0003         20           20          C05        Y
         C05          0003         12           10          D71        N

I want to calculate a new value called sales_index_goal for product C12. The formula for this value is:

the average of the sales_index of product 'C12' when goal_met = 'Y' and the sales_index of its product_sub when goal_met = 'N', taken over the sales periods before the current one.

So, for example, if I wanted to calculate sales_index_goal for product 'C12' for sales_period 0003, it would be calculated as:

average of (20, 30, 15), where 20 and 30 are the sales_index values of product 'C12' in sales periods 0001 and 0002, and 15 is the sales_index of product 'F20' in sales period 0002.
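
The arithmetic above can be checked with a minimal Python sketch over the sample rows. The helper name `sales_index_goal` is hypothetical, and the sketch assumes that the substitute is the product_sub listed on the product's row for the target period, which matches the example (F20 is C12's product_sub in period 0003).

```python
# Sample rows from sales_performance:
# (product_ID, sales_period, sales_qty, sales_index, product_sub, goal_met)
ROWS = [
    ("C12", "0001", 15, 20, "D71", "Y"),
    ("D71", "0001", 7, 9, "C12", "N"),
    ("F20", "0001", 25, 30, "C05", "Y"),
    ("C05", "0001", 10, 15, "F20", "N"),
    ("C12", "0002", 15, 30, "C05", "Y"),
    ("C05", "0002", 12, 6, "C12", "N"),
    ("D71", "0002", 30, 20, "F20", "Y"),
    ("F20", "0002", 20, 15, "D71", "N"),
    ("C12", "0003", 5, 4, "F20", "N"),
    ("F20", "0003", 40, 35, "C12", "Y"),
    ("D71", "0003", 20, 20, "C05", "Y"),
    ("C05", "0003", 12, 10, "D71", "N"),
]


def sales_index_goal(product, period):
    """Hypothetical helper: return the qualifying values and their average."""
    # Assumption: the substitute comes from the product's row in the target period.
    sub = next(r[4] for r in ROWS if r[0] == product and r[1] == period)
    values = [
        r[3]
        for r in ROWS
        if r[1] < period  # only earlier sales periods
        and ((r[0] == product and r[5] == "Y") or (r[0] == sub and r[5] == "N"))
    ]
    average = sum(values) / len(values) if values else None
    return values, average


values, goal = sales_index_goal("C12", "0003")
print(values, round(goal, 2))
```

For the first sales period there are no earlier rows, so the helper returns an empty list and None rather than dividing by zero.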

I have no trouble calculating this value for a single, fixed sales period. However, I'm having a hard time coming up with a query that calculates it for product 'C12' across all sales periods. Currently, I have written this query, which is not working:

SELECT 
    s.*, 
    AVG(CASE 
        WHEN s2.goal_met = "Y" AND s2.product_id = "C2" 
            THEN s2.sales_index 
        WHEN s2.product_id = (
            SELECT s2.product_sub WHERE s2.product_id = "C12"
        ) AND s2.goals_met = "N" 
            THEN s2.sales_index
        ELSE NULL 
    END)
    OVER (
        ORDER BY s.sales_period 
        ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
    ) AS sales_index_goal
FROM 
    sales_performance s, 
    sales_performance s2
WHERE s.product_id = "C12" 

I would really appreciate any help on this. Bonus: Calculating this value for all products for all sales periods.

Edit: The answer below works well for calculating sales_index_goal for product C12; however, it doesn't work for product F20 (a more detailed explanation of why is in my comment below the answer). You can see the result of the query for product F20 here.

2 answers:

Answer 0 (score: 2)

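Using window functions is a good place to start. One problem is that you want to average values that come from different columns, possibly with a different number of occurrences in each.

I think you need to break the average down into its parts: sum the values, then divide by the total number of occurrences.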
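
As a small illustration of why the decomposition matters, here is a Python sketch using the values from the question's period 0003 example. The variable names are made up for illustration. Pooling sums and counts gives the average over all qualifying values, while averaging the two group averages would weight each group equally and give a different number.

```python
# Values from the question's example for C12, sales period 0003.
own_values = [20, 30]  # C12's sales_index where goal_met = 'Y' (periods 0001, 0002)
sub_values = [15]      # F20's sales_index where goal_met = 'N' (period 0002)

# Decomposed average: add the sums, add the counts, then divide once.
total = sum(own_values) + sum(sub_values)
count = len(own_values) + len(sub_values)
pooled_average = total / count

# Averaging the two group averages is NOT the same when group sizes differ.
average_of_averages = (
    sum(own_values) / len(own_values) + sum(sub_values) / len(sub_values)
) / 2

print(round(pooled_average, 2), average_of_averages)
```

Only the pooled figure matches the 21.67 that the question expects for this period.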

Consider:

SELECT *
FROM (
    SELECT 
        s.*,
        (
            0.0 + 
            COALESCE(SUM(CASE WHEN goal_met = 'Y' THEN sales_index END) OVER(
                PARTITION BY product_id 
                ORDER BY sales_period 
                ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
            ), 0)
            + COALESCE(SUM(CASE WHEN goal_met = 'N' THEN sales_index END) OVER(
                PARTITION BY product_sub 
                ORDER BY sales_period 
                ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
            ), 0)
        ) / (
            COALESCE(SUM(goal_met = 'Y') OVER(
                PARTITION BY product_id 
                ORDER BY sales_period 
                ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
            ), 0)
            + COALESCE(SUM(goal_met = 'N') OVER(
                PARTITION BY product_sub 
                ORDER BY sales_period 
                ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
            ), 0)
        ) AS sales_index_goal
    FROM sales_performance s
) x WHERE product_ID = 'C12'

In this DB fiddle, with your sample data, the query returns:

| product_ID | sales_period | sales_qty | sales_index | product_sub | goal_met | sales_index_goal   |
| ---------- | ------------ | --------- | ----------- | ----------- | -------- | ------------------ |
| C12        | 1            | 15        | 20          | D71         | Y        |                    |
| C12        | 2            | 15        | 30          | C05         | Y        | 20                 |
| C12        | 3            | 5         | 4           | F20         | N        | 21.666666666666668 |

Answer 1 (score: 1)

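I'm not sure this actually requires window functions.

The example below simply self-joins the table and groups the result, restricted to the C12 product.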

CREATE TABLE sales_performance
(
 product_ID varchar(3) not null,
 sales_period varchar(4) not null,
 sales_qty char(2) not null,
 sales_index char(2) not null,
 product_sub char(3) not null,
 goal_met char(1)  not null,
 PRIMARY KEY (product_ID, sales_period)
);

INSERT INTO sales_performance
(product_ID,sales_period,sales_qty,sales_index,product_sub,goal_met) 
VALUES
 ('C12','0001','15','20','D71','Y')
,('D71','0001','07','09','C12','N')
,('F20','0001','25','30','C05','Y')
,('C05','0001','10','15','F20','N')
,('C12','0002','15','30','C05','Y')
,('C05','0002','12','06','C12','N')
,('D71','0002','30','20','F20','Y')
,('F20','0002','20','15','D71','N')
,('C12','0003','05','04','F20','N')
,('F20','0003','40','35','C12','Y')
,('D71','0003','20','20','C05','Y')
,('C05','0003','12','10','D71','N')
;

SELECT c12.*, sub.sales_index
FROM sales_performance c12
LEFT JOIN sales_performance sub 
  ON sub.sales_period < c12.sales_period 
 AND 
 (
    (sub.product_ID = c12.product_sub AND sub.goal_met = 'N') OR
    (sub.product_ID = c12.product_ID AND sub.goal_met = 'Y')
 )
WHERE c12.product_id = 'C12'
ORDER BY c12.product_ID, c12.sales_period, sub.product_ID;

product_ID | sales_period | sales_qty | sales_index | product_sub | goal_met | sales_index
:--------- | :----------- | :-------- | :---------- | :---------- | :------- | :----------
C12        | 0001         | 15        | 20          | D71         | Y        | null       
C12        | 0002         | 15        | 30          | C05         | Y        | 15         
C12        | 0002         | 15        | 30          | C05         | Y        | 20         
C12        | 0003         | 05        | 04          | F20         | N        | 20         
C12        | 0003         | 05        | 04          | F20         | N        | 30         
C12        | 0003         | 05        | 04          | F20         | N        | 15         

SELECT 
 c12.*
 , ROUND(AVG(sub.sales_index),1) AS sales_index_goal
FROM sales_performance c12
LEFT JOIN sales_performance sub 
  ON sub.sales_period < c12.sales_period 
 AND 
 (
    (sub.product_ID = c12.product_sub AND sub.goal_met = 'N') OR
    (sub.product_ID = c12.product_ID AND sub.goal_met = 'Y')
 )
WHERE c12.product_id = 'C12' 
GROUP BY c12.product_ID, c12.sales_period
ORDER BY c12.product_ID, c12.sales_period;

product_ID | sales_period | sales_qty | sales_index | product_sub | goal_met | sales_index_goal
:--------- | :----------- | :-------- | :---------- | :---------- | :------- | :---------------
C12        | 0001         | 15        | 20          | D71         | Y        | null            
C12        | 0002         | 15        | 30          | C05         | Y        | 17.5            
C12        | 0003         | 05        | 04          | F20         | N        | 21.7            
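
For the "all products, all sales periods" bonus in the question, one possible sketch is to drop the WHERE filter from the same self-join. The Python wrapper below uses the standard sqlite3 module with an in-memory database so it can be run as-is. The aliases `cur` and `prev` and the INTEGER column types are assumptions made for this illustration, not part of the schema above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE sales_performance (
        product_ID   TEXT NOT NULL,
        sales_period TEXT NOT NULL,
        sales_qty    INTEGER NOT NULL,  -- assumption: numeric instead of char(2)
        sales_index  INTEGER NOT NULL,  -- assumption: numeric instead of char(2)
        product_sub  TEXT NOT NULL,
        goal_met     TEXT NOT NULL,
        PRIMARY KEY (product_ID, sales_period)
    )
    """
)
conn.executemany(
    "INSERT INTO sales_performance VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("C12", "0001", 15, 20, "D71", "Y"), ("D71", "0001", 7, 9, "C12", "N"),
        ("F20", "0001", 25, 30, "C05", "Y"), ("C05", "0001", 10, 15, "F20", "N"),
        ("C12", "0002", 15, 30, "C05", "Y"), ("C05", "0002", 12, 6, "C12", "N"),
        ("D71", "0002", 30, 20, "F20", "Y"), ("F20", "0002", 20, 15, "D71", "N"),
        ("C12", "0003", 5, 4, "F20", "N"), ("F20", "0003", 40, 35, "C12", "Y"),
        ("D71", "0003", 20, 20, "C05", "Y"), ("C05", "0003", 12, 10, "D71", "N"),
    ],
)

# Same join conditions as the C12 query, without the product filter.
query = """
    SELECT cur.product_ID,
           cur.sales_period,
           ROUND(AVG(prev.sales_index), 1) AS sales_index_goal
    FROM sales_performance cur
    LEFT JOIN sales_performance prev
      ON prev.sales_period < cur.sales_period
     AND (
        (prev.product_ID = cur.product_sub AND prev.goal_met = 'N') OR
        (prev.product_ID = cur.product_ID AND prev.goal_met = 'Y')
     )
    GROUP BY cur.product_ID, cur.sales_period
    ORDER BY cur.product_ID, cur.sales_period
"""

# Map (product_ID, sales_period) -> sales_index_goal (None when no earlier rows qualify).
results = {(product, period): goal for product, period, goal in conn.execute(query)}
for key, goal in results.items():
    print(key, goal)
```

Because the join takes the substitute from each row's own product_sub, every product and period gets its own set of earlier rows, and periods with no qualifying rows come back as NULL (None in Python).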

db<>fiddle here