I have the following data:
ID |MPERIOD|FRDATE    |FR
===+=======+==========+==
100|2017M01|01.01.2017|60 \               \               \
101|2017M01|02.01.2017|75  > YtD 2017M01   |               |
103|2017M01|08.01.2017|48 /                 > YtD 2017M02  |
104|2017M02|06.02.2017|55                  |                > YtD 2017M03
105|2017M02|15.02.2017|63                 /                |
106|2017M03|18.03.2017|41                                  |
107|2017M03|22.03.2017|71                                 /
...|.......|..........|..
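I need to calculate the 80th percentile of FR for each month, and also the YtD (year-to-date) 80th percentile for that month, i.e. over all rows from the start of the year up to and including the month being calculated.
I am using the following SQL query: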
SELECT DISTINCT mperiod,
       ROUND(PERCENTILE_CONT(0.8) WITHIN GROUP (ORDER BY fr) OVER (PARTITION BY mperiod),2) "80%_FR",
       ROUND(PERCENTILE_CONT(0.8) WITHIN GROUP (ORDER BY fr) OVER (PARTITION BY SUBSTR(mperiod,1,4)),2) "80%_FR_YtD"
FROM   mytable
ORDER  BY 1
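If I run this query on the last day of a month, when there is no data yet for the following month, it computes the YtD value correctly. For example, if I have data for the first six months and none for the seventh, and I calculate for the sixth month, then the yearly partition OVER (PARTITION BY SUBSTR(mperiod,1,4)) gives the correct YtD value. But if there is data after the month in question, that data is included in the partition as well, so the result no longer stops at that month.
How can I calculate the YtD retroactively for earlier months? For example, the YtD for the third month should be computed over the first three months of the year only, not over every month of the year.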
Answer 0 (score: 1)
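Since you can't use a windowing clause with PERCENTILE_CONT, or add extra columns to its ORDER BY (boo!), here is one way to achieve what you're after. N.B. it's not pretty, and I'm sure it won't perform particularly well, but it should at least work!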
WITH mytable AS (SELECT 100 ID, '2017M01' mperiod, to_date('01/01/2017', 'dd/mm/yyyy') frdate, 60 fr FROM dual UNION ALL
                 SELECT 101 ID, '2017M01' mperiod, to_date('02/01/2017', 'dd/mm/yyyy') frdate, 75 fr FROM dual UNION ALL
                 SELECT 103 ID, '2017M01' mperiod, to_date('08/01/2017', 'dd/mm/yyyy') frdate, 48 fr FROM dual UNION ALL
                 SELECT 104 ID, '2017M02' mperiod, to_date('06/02/2017', 'dd/mm/yyyy') frdate, 55 fr FROM dual UNION ALL
                 SELECT 105 ID, '2017M02' mperiod, to_date('15/02/2017', 'dd/mm/yyyy') frdate, 63 fr FROM dual UNION ALL
                 SELECT 106 ID, '2017M03' mperiod, to_date('18/03/2017', 'dd/mm/yyyy') frdate, 41 fr FROM dual UNION ALL
                 SELECT 107 ID, '2017M03' mperiod, to_date('22/03/2017', 'dd/mm/yyyy') frdate, 71 fr FROM dual UNION ALL
                 SELECT 108 ID, '2016M12' mperiod, to_date('22/12/2016', 'dd/mm/yyyy') frdate, 42 fr FROM dual UNION ALL
                 SELECT 109 ID, '2016M11' mperiod, to_date('22/11/2016', 'dd/mm/yyyy') frdate, 32 fr FROM dual),
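      unpckd AS (-- Copy each row into every YtD period of its year that it belongs to:
                 -- a row from month m is paired with d.id = m, m+1, ..., 12.
                 SELECT mt.ID,
                        mt.mperiod,
                        mt.frdate,
                        mt.fr,
                        CASE WHEN substr(mt.mperiod, -2) <= d.id THEN SUBSTR(mt.mperiod, 1, 5) || to_char(d.id, 'fm09')
                        END new_mperiod,
                        d.id dummy_id
                 FROM   mytable mt
                        INNER JOIN (SELECT LEVEL ID
                                    FROM   dual
                                    CONNECT BY LEVEL <= 12) d ON substr(mt.mperiod, -2) <= d.id),
         res AS (SELECT mperiod,
                        new_mperiod,
                        -- Monthly value: only rows whose own month equals the label get a non-null partition key.
                        ROUND(PERCENTILE_CONT(0.8) WITHIN GROUP (ORDER BY fr) OVER (PARTITION BY CASE WHEN mperiod = new_mperiod THEN mperiod END),2) fr_80,
                        -- YtD value: every row carrying this label, i.e. all rows of the year up to that month.
                        ROUND(PERCENTILE_CONT(0.8) WITHIN GROUP (ORDER BY fr) OVER (PARTITION BY new_mperiod),2) fr_80_ytd
                 FROM   unpckd)
-- Keep only the rows whose label is their own month, so that months without data are not reported.
SELECT DISTINCT new_mperiod mperiod,
       fr_80 "80%_FR",
       fr_80_ytd "80%_FR_YtD"
FROM   res
WHERE  new_mperiod = mperiod
ORDER  BY 1;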
MPERIOD      80%_FR 80%_FR_YtD
-------- ---------- ----------
2016M11          32         32
2016M12          42         40
2017M01          69         69
2017M02        61.4       65.4
2017M03          65       69.4
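This works by doing a partial cross join between the numbers 1 to 12 (the 12 months of a year) and the last two characters of mperiod. Once that is done, we know every YtD period a row belongs to (i.e. the number 1 matches the 2017M01 rows, the number 2 matches the 2017M01 and 2017M02 rows, and so on), so you can generate a label for each of those periods (which I've called new_mperiod) and partition by it.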
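To make the expansion step easier to picture, here is a minimal sketch that runs the same join against a single row. The CTE name one_row is made up for illustration and mirrors row 104 of the sample data; the explicit TO_NUMBER is an addition that only spells out the implicit conversion the query above relies on.

```sql
-- Hypothetical one-row input (mirrors ID 104 from the sample data).
WITH one_row AS (SELECT 104 id, '2017M02' mperiod, 55 fr FROM dual)
SELECT r.id,
       r.mperiod,
       -- Year prefix ('2017M') plus the zero-padded month number from the generator.
       SUBSTR(r.mperiod, 1, 5) || TO_CHAR(d.id, 'fm09') AS new_mperiod
FROM   one_row r
       INNER JOIN (SELECT LEVEL id
                   FROM   dual
                   CONNECT BY LEVEL <= 12) d
               ON TO_NUMBER(SUBSTR(r.mperiod, -2)) <= d.id
ORDER  BY new_mperiod;
```

The February row comes back eleven times, labelled 2017M02 through 2017M12: once for each YtD period of 2017 that includes February.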
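It's obviously inefficient (the partial cross join produces more rows than needed for any year that doesn't have data for every month, and those extra rows are only filtered out later), but I couldn't think of a better way of doing it.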
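One possible alternative, offered only as a sketch and not as the method used above: PERCENTILE_CONT also exists as a plain aggregate, so each distinct month can be self-joined to the rows of the same year up to that month and then grouped. The aliases m and t are made up for illustration; mytable and its columns are those from the question. It assumes mperiod always has the form YYYYMnn, so that string comparison orders months correctly within a year.

```sql
SELECT m.mperiod,
       -- Monthly value: the CASE yields NULL for rows of earlier months,
       -- and PERCENTILE_CONT ignores NULLs.
       ROUND(PERCENTILE_CONT(0.8) WITHIN GROUP
             (ORDER BY CASE WHEN t.mperiod = m.mperiod THEN t.fr END), 2) AS "80%_FR",
       -- YtD value: all joined rows, i.e. same year and month <= the reported month.
       ROUND(PERCENTILE_CONT(0.8) WITHIN GROUP (ORDER BY t.fr), 2) AS "80%_FR_YtD"
FROM   (SELECT DISTINCT mperiod FROM mytable) m
       INNER JOIN mytable t
               ON SUBSTR(t.mperiod, 1, 4) = SUBSTR(m.mperiod, 1, 4)
              AND t.mperiod <= m.mperiod
GROUP  BY m.mperiod
ORDER  BY m.mperiod;
```

On the sample data this should give the same figures as the result table above. Because it only joins to months that actually occur in the table, it does not create rows for months without data, but whether it actually runs faster would need to be measured on real volumes.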