Percentile for every month (with running YtD)

Time: 2017-07-24 12:30:46

Tags: sql oracle

I have the following data:

ID |MPERIOD|FRDATE    |FR
===+=======+==========+==
100|2017M01|01.01.2017|60  \              \              \
101|2017M01|02.01.2017|75   > YtD 2017M01  |              |
103|2017M01|08.01.2017|48  /               > YtD 2017M02  |
104|2017M02|06.02.2017|55                  |              > YtD 2017M03
105|2017M02|15.02.2017|63                 /               |
106|2017M03|18.03.2017|41                                 |
107|2017M03|22.03.2017|71                                /
...|.......|..........|..

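I need to calculate the 80th percentile for each month, as well as the YtD value for that month (from the beginning of the year up to and including the month being calculated).

I use the following SQL query: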

SELECT DISTINCT mperiod,
   ROUND(PERCENTILE_CONT(0.8) WITHIN GROUP (ORDER BY fr) OVER (PARTITION BY mperiod),2) "80%_FR", 
   ROUND(PERCENTILE_CONT(0.8) WITHIN GROUP (ORDER BY fr) OVER (PARTITION BY SUBSTR(mperiod,1,4)),2) "80%_FR_YtD" 
FROM mytable
ORDER BY 1

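If I run this query on the last day of a month and have no data for later months, the SQL calculates the YtD value correctly. For example, if I have data for the first six months and none for the seventh, then calculating for the sixth month with the yearly partition OVER (PARTITION BY SUBSTR(mperiod,1,4)) gives the correct YtD value. But if I have data after the month in question, that data is included in the PARTITION BY as well, so the value is no longer calculated only up to that month.

How can I calculate the YtD retroactively for earlier months? For example, the YtD for the third month should use only the first three months of the year, not all months of the year.

1 Answer:

Answer 0: (score: 1)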

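Since you can't use a windowing clause or add extra columns to the ORDER BY in PERCENTILE_CONT (boo!), here is one way to achieve your goal. N.B. it isn't pretty, and I'm sure it won't be very performant, but it should at least work!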

WITH mytable AS (SELECT 100 ID, '2017M01' mperiod, to_date('01/01/2017', 'dd/mm/yyyy') frdate, 60 fr FROM dual UNION ALL
                 SELECT 101 ID, '2017M01' mperiod, to_date('02/01/2017', 'dd/mm/yyyy') frdate, 75 fr FROM dual UNION ALL
                 SELECT 103 ID, '2017M01' mperiod, to_date('08/01/2017', 'dd/mm/yyyy') frdate, 48 fr FROM dual UNION ALL
                 SELECT 104 ID, '2017M02' mperiod, to_date('06/02/2017', 'dd/mm/yyyy') frdate, 55 fr FROM dual UNION ALL
                 SELECT 105 ID, '2017M02' mperiod, to_date('15/02/2017', 'dd/mm/yyyy') frdate, 63 fr FROM dual UNION ALL
                 SELECT 106 ID, '2017M03' mperiod, to_date('18/03/2017', 'dd/mm/yyyy') frdate, 41 fr FROM dual UNION ALL
                 SELECT 107 ID, '2017M03' mperiod, to_date('22/03/2017', 'dd/mm/yyyy') frdate, 71 fr FROM dual UNION ALL
                 SELECT 108 ID, '2016M12' mperiod, to_date('22/12/2016', 'dd/mm/yyyy') frdate, 42 fr FROM dual UNION ALL
                 SELECT 109 ID, '2016M11' mperiod, to_date('22/11/2016', 'dd/mm/yyyy') frdate, 32 fr FROM dual),
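      unpckd AS (-- Fan each row out to every month number (1..12) at or after its own month,
                 -- so that the row is counted in every YtD period it belongs to.
                 SELECT mt.ID,
                        mt.mperiod,
                        mt.frdate,
                        mt.fr,
                        -- The join condition already guarantees month <= d.id, so this CASE always matches.
                        CASE WHEN to_number(substr(mt.mperiod, -2)) <= d.id THEN SUBSTR(mt.mperiod, 1, 5) || to_char(d.id, 'fm09')
                        END  new_mperiod,
                        d.id dummy_id
                 FROM   mytable mt
                        INNER JOIN (SELECT LEVEL ID
                                    FROM   dual
                                    CONNECT BY LEVEL <= 12) d ON to_number(substr(mt.mperiod, -2)) <= d.id),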
         res AS (SELECT mperiod,
                        new_mperiod,
                        ROUND(PERCENTILE_CONT(0.8) WITHIN GROUP (ORDER BY fr) OVER (PARTITION BY CASE WHEN mperiod = new_mperiod THEN mperiod END),2) fr_80, 
                        ROUND(PERCENTILE_CONT(0.8) WITHIN GROUP (ORDER BY fr) OVER (PARTITION BY new_mperiod),2) fr_80_ytd
                 FROM unpckd)
SELECT DISTINCT new_mperiod mperiod,
                fr_80 "80%_FR",
                fr_80_ytd "80%_FR_YtD"
FROM   res
WHERE  new_mperiod = mperiod
ORDER BY 1;

MPERIOD      80%_FR 80%_FR_YtD
-------- ---------- ----------
2016M11          32         32
2016M12          42         40
2017M01          69         69
2017M02        61.4       65.4
2017M03          65       69.4
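
As a sanity check on the YtD figure for 2017M02 (this is not part of the original answer): PERCENTILE_CONT interpolates linearly between the two nearest ordered values. A minimal sketch, assuming only the five fr values from 2017M01 and 2017M02 in the sample data; the column alias is made up for illustration:

```sql
-- Five values up to and including 2017M02, sorted: 48, 55, 60, 63, 75.
-- Row position RN = 1 + 0.8 * (5 - 1) = 4.2, i.e. between the 4th value (63) and the 5th (75).
-- Interpolated result: 63 + 0.2 * (75 - 63) = 65.4
SELECT PERCENTILE_CONT(0.8) WITHIN GROUP (ORDER BY fr) AS fr_80_ytd
FROM  (SELECT 60 fr FROM dual UNION ALL
       SELECT 75 fr FROM dual UNION ALL
       SELECT 48 fr FROM dual UNION ALL
       SELECT 55 fr FROM dual UNION ALL
       SELECT 63 fr FROM dual);
```

The same arithmetic reproduces the other rows, for example 60 + 0.6 * (75 - 60) = 69 for 2017M01.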

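This works by doing a partial cross join between the numbers 1 to 12 (the 12 months of the year) and the last two digits of mperiod. Once we have that, we know every YtD period a row belongs to (i.e. the number 1 matches only 2017M01 rows, 2 matches both 2017M01 and 2017M02 rows, and so on), so you can generate a label for each of those periods (which I've called new_mperiod) and partition by it.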
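
To make the fan-out concrete, here is a small sketch (an illustration, not part of the original answer) that applies the same join condition to a single hard-coded period, '2017M02', and lists the labels it would receive:

```sql
-- One 2017M02 row is copied into every YtD period from February onwards,
-- so this returns 11 rows: 2017M02, 2017M03, ..., 2017M12.
SELECT '2017M02'                          AS mperiod,
       '2017M' || TO_CHAR(d.id, 'fm09')   AS new_mperiod
FROM  (SELECT LEVEL AS id
       FROM   dual
       CONNECT BY LEVEL <= 12) d
WHERE  TO_NUMBER(SUBSTR('2017M02', -2)) <= d.id
ORDER BY d.id;
```

Partitioning by new_mperiod then groups each label with every row from that month and all earlier months of the same year.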

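Obviously this is inefficient (the partial cross join generates more rows than needed for years that don't have data for every month, and those rows are filtered out later), but I couldn't come up with a better way of doing it.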
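
One possible variation, offered only as a sketch and not as the answer's method: join each distinct period to the rows of the same year up to that period, then use PERCENTILE_CONT as a plain aggregate. This assumes mperiod always has the fixed-width form YYYYMnn, so that string comparison orders periods correctly, and it relies on PERCENTILE_CONT ignoring NULLs. Whether it actually performs better would need to be checked against real data.

```sql
SELECT p.mperiod,
       -- Monthly value: rows from other months become NULL and are ignored.
       ROUND(PERCENTILE_CONT(0.8) WITHIN GROUP
             (ORDER BY CASE WHEN t.mperiod = p.mperiod THEN t.fr END), 2) AS "80%_FR",
       -- YtD value: every joined row (same year, period <= p.mperiod) counts.
       ROUND(PERCENTILE_CONT(0.8) WITHIN GROUP (ORDER BY t.fr), 2)        AS "80%_FR_YtD"
FROM  (SELECT DISTINCT mperiod FROM mytable) p
       INNER JOIN mytable t
               ON SUBSTR(t.mperiod, 1, 4) = SUBSTR(p.mperiod, 1, 4)
              AND t.mperiod <= p.mperiod
GROUP BY p.mperiod
ORDER BY p.mperiod;
```

Because the join starts from the periods that exist in mytable, no rows are produced for months that have no data.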