Calculate the 95th percentile of the difference between an actual and a predicted column in SQL

Time: 2016-08-24 16:11:20

Tags: sql postgresql tsql

I have a PostgreSQL database like this.

The tables and their columns, with data types, are:

Readings

meas_id - integer (foreign key to Measurement.meas_id)
actual_meas - integer
predicted_meas - integer
pdatetime - timestamp with time zone (UTC)
status - enum ('completed', 'inprogress', 'nottaken')

Measurement

meas_id - integer
meas_name - string

meas_name takes the values length, breadth, width, and height.

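For each of the measurements "length" and "breadth", I am trying to calculate the 95th percentile of the difference between the actual and predicted values, over all completed readings from the last 30 days.

I tried the following, but it does not work: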

SELECT 
Measurement.meas_name, 
MIN(Readings.actual_meas - Readings.predicted_meas) AS Difference
FROM
(
    SELECT TOP 95 PERCENT 
    FROM Readings
    ORDER BY Difference DESC
) AS NinetyFivePerc
JOIN Measurement
WHERE NinetyFivePerc.meas_id = Measurement.meas_id
AND NinetyFivePerc.pdatetime >= DATEADD(DAY, -30, GETDATE())
AND Measurement.meas_name IN ('length','breadth')
AND NinetyFivePerc.status = 'completed'

I am still learning SQL, so suggestions on how to do this in an optimized way are welcome.

1 answer:

Answer 0 (score: 1)

Postgres has the percentile_disc() and percentile_cont() aggregate functions.
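As a rough illustration of how the two differ, here is a minimal sketch on made-up input (the integers 1 through 10 from generate_series(), not the Readings table). percentile_disc() returns a value that actually occurs in the input, while percentile_cont() interpolates linearly between neighbouring values.

```sql
-- Toy input: the integers 1..10 (illustrative only, not the question's data).
SELECT PERCENTILE_DISC(0.95) WITHIN GROUP (ORDER BY v) AS p95_disc,  -- 10: first value whose cumulative share reaches 0.95
       PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY v) AS p95_cont   -- 9.55: interpolated between 9 and 10
FROM generate_series(1, 10) AS t(v);
```

For an integer expression such as actual_meas - predicted_meas, percentile_disc() keeps the result an integer, whereas percentile_cont() returns double precision.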

So, you can do something like this:

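SELECT m.meas_name,
       -- 5th and 95th percentiles of the signed error (actual minus predicted)
       PERCENTILE_CONT(0.05) WITHIN GROUP (ORDER BY r.actual_meas - r.predicted_meas) AS p05_diff,
       PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY r.actual_meas - r.predicted_meas) AS p95_diff
FROM Readings r JOIN
     Measurement m  -- table name as given in the question
     ON r.meas_id = m.meas_id
WHERE m.meas_name IN ('length', 'breadth') AND
      r.status = 'completed' AND
      r.pdatetime >= now() - interval '30 days'  -- last 30 days, as the question asks
GROUP BY m.meas_name;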
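
If the size of the error matters more than its direction, one possible variant is to take the percentile of the absolute difference. This is an assumption about the intent, since the question does not say whether the difference should be signed. The p95_abs_diff alias is made up for illustration; the table and column names are the ones given in the question.

```sql
-- Sketch: 95th percentile of the absolute error per measurement,
-- restricted to completed readings from the last 30 days.
SELECT m.meas_name,
       PERCENTILE_CONT(0.95) WITHIN GROUP (
           ORDER BY abs(r.actual_meas - r.predicted_meas)
       ) AS p95_abs_diff  -- hypothetical alias
FROM Readings r
JOIN Measurement m ON r.meas_id = m.meas_id
WHERE m.meas_name IN ('length', 'breadth')
  AND r.status = 'completed'
  AND r.pdatetime >= now() - interval '30 days'
GROUP BY m.meas_name;
```

This returns a single row per measurement name, so 95% of the completed readings in the window have an absolute error at or below the reported value.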