我正在尝试计算以下内容:
我的数据集如下:
| Part | Step | Step_Start | Part_Finish | TheTime |
|:----:|:----:|:----------:|:-----------:|:-----------:|
| 1 | 200 | 15-Aug-18 | 19-Jun-19 | 307.4926273 |
| 2 | 200 | 7-Jun-19 | 19-Jun-19 | 11.4434375 |
| 3 | 200 | 17-Sep-18 | 4-Feb-19 | 139.4360417 |
| 4 | 200 | 30-Jan-19 | 4-Feb-19 | 4.356666667 |
| 5 | 200 | 1-Oct-18 | 18-Feb-19 | 139.4528009 |
| 6 | 200 | 13-Feb-19 | 18-Feb-19 | 4.50375 |
| 7 | 200 | 17-Oct-18 | 28-Mar-19 | 161.7007176 |
| 8 | 200 | 12-Nov-18 | 28-Mar-19 | 135.630625 |
| 9 | 200 | 25-Oct-18 | 26-Feb-19 | 123.6026968 |
| 10 | 200 | 22-Feb-19 | 26-Feb-19 | 3.628090278 |
| 11 | 200 | 30-Oct-18 | 3-Jan-19 | 64.51466435 |
| 12 | 200 | 12-Dec-18 | 3-Jan-19 | 21.48703704 |
| 13 | 200 | 15-Nov-18 | 14-Jan-19 | 59.41373843 |
| 14 | 200 | 7-Jan-19 | 14-Jan-19 | 6.621828704 |
| 15 | 200 | 15-Nov-18 | 12-Jan-19 | 57.62283565 |
| 16 | 200 | 8-Jan-19 | 12-Jan-19 | 3.264398148 |
| 17 | 200 | 15-Nov-18 | 7-Mar-19 | 111.5082523 |
| 18 | 200 | 4-Mar-19 | 7-Mar-19 | 2.153587963 |
| 19 | 200 | 16-Nov-18 | 23-May-19 | 187.6931481 |
| 20 | 200 | 16-Nov-18 | 3-Jan-19 | 47.47916667 |
| 21 | 200 | 17-Dec-18 | 3-Jan-19 | 16.62722222 |
| 22 | 200 | 20-Nov-18 | 14-Feb-19 | 85.6115625 |
| 23 | 200 | 9-Feb-19 | 14-Feb-19 | 4.520787037 |
| 24 | 200 | 19-Nov-18 | 14-Jan-19 | 55.53342593 |
| 25 | 200 | 9-Jan-19 | 14-Jan-19 | 4.721400463 |
| 26 | 200 | 26-Nov-18 | 9-Jan-19 | 43.50748843 |
| 27 | 200 | 4-Jan-19 | 9-Jan-19 | 4.417164352 |
| 28 | 200 | 26-Nov-18 | 21-Jan-19 | 55.59988426 |
| 29 | 200 | 13-Jan-19 | 21-Jan-19 | 7.535 |
| 30 | 200 | 16-Jan-19 | 21-Jan-19 | 4.618796296 |
| 31 | 200 | 26-Nov-18 | 11-Jan-19 | 45.42148148 |
| 32 | 200 | 4-Jan-19 | 11-Jan-19 | 6.316921296 |
| 33 | 200 | 4-Dec-18 | 24-Jan-19 | 50.3669213 |
| 34 | 200 | 18-Jan-19 | 24-Jan-19 | 5.589467593 |
| 35 | 200 | 4-Dec-18 | 31-Jan-19 | 57.26877315 |
| 36 | 200 | 22-Jan-19 | 31-Jan-19 | 8.240034722 |
| 37 | 200 | 5-Dec-18 | 28-Jun-19 | 204.5283912 |
| 38 | 200 | 26-Jun-19 | 28-Jun-19 | 1.508252315 |
| 39 | 200 | 9-Feb-19 | 19-Feb-19 | 9.532893519 |
| 40 | 200 | 7-Dec-18 | 14-Feb-19 | 68.51900463 |
| 41 | 200 | 5-Feb-19 | 14-Feb-19 | 8.641076389 |
| 42 | 200 | 11-Dec-18 | 25-Jan-19 | 44.50501157 |
| 43 | 200 | 22-Jan-19 | 25-Jan-19 | 2.511435185 |
| 44 | 200 | 13-Dec-18 | 17-Jan-19 | 34.43806713 |
| 45 | 200 | 14-Jan-19 | 17-Jan-19 | 2.210972222 |
| 46 | 200 | 13-Dec-18 | 24-Jan-19 | 41.38921296 |
| 47 | 200 | 17-Jan-19 | 24-Jan-19 | 6.444664352 |
| 48 | 200 | 10-Jan-19 | 7-Feb-19 | 27.43130787 |
| 49 | 200 | 1-Feb-19 | 7-Feb-19 | 5.349189815 |
| 50 | 200 | 18-Dec-18 | 4-Feb-19 | 47.50416667 |
| 51 | 200 | 29-Jan-19 | 4-Feb-19 | 5.481979167 |
| 52 | 200 | 3-Jan-19 | 30-Jan-19 | 26.46112269 |
| 53 | 200 | 23-Jan-19 | 30-Jan-19 | 6.712175926 |
| 54 | 200 | 4-Jan-19 | 5-Feb-19 | 31.49590278 |
| 55 | 200 | 30-Jan-19 | 5-Feb-19 | 5.385798611 |
| 56 | 200 | 23-Jan-19 | 20-Mar-19 | 55.296875 |
| 57 | 200 | 21-Feb-19 | 20-Mar-19 | 26.06854167 |
| 58 | 200 | 22-Jan-19 | 14-Mar-19 | 50.57989583 |
| 59 | 200 | 8-Mar-19 | 14-Mar-19 | 5.147303241 |
| 60 | 200 | 22-Jan-19 | 21-Feb-19 | 29.46405093 |
| 61 | 200 | 14-Feb-19 | 21-Feb-19 | 6.701724537 |
| 62 | 200 | 24-Jan-19 | 23-Apr-19 | 88.50689815 |
| 63 | 200 | 17-Apr-19 | 23-Apr-19 | 5.725405093 |
| 64 | 200 | 28-Jan-19 | 21-Feb-19 | 23.50082176 |
| 65 | 200 | 13-Feb-19 | 21-Feb-19 | 7.115717593 |
| 66 | 200 | 31-Jan-19 | 28-Feb-19 | 27.55881944 |
| 67 | 200 | 25-Feb-19 | 28-Feb-19 | 2.633738426 |
| 68 | 200 | 31-Jan-19 | 27-Feb-19 | 26.46105324 |
| 69 | 200 | 23-Feb-19 | 27-Feb-19 | 3.531423611 |
| 70 | 200 | 1-Feb-19 | 28-Feb-19 | 26.45835648 |
| 71 | 200 | 27-Feb-19 | 28-Feb-19 | 0.471296296 |
| 72 | 200 | 6-Feb-19 | 27-Feb-19 | 20.54436343 |
| 73 | 200 | 23-Feb-19 | 27-Feb-19 | 3.598854167 |
| 74 | 200 | 6-Feb-19 | 5-Mar-19 | 26.54347222 |
| 75 | 200 | 28-Feb-19 | 5-Mar-19 | 4.303773148 |
| 76 | 200 | 12-Feb-19 | 6-Mar-19 | 21.56993056 |
| 77 | 200 | 1-Mar-19 | 6-Mar-19 | 4.597615741 |
| 78 | 200 | 12-Feb-19 | 14-Mar-19 | 29.50417824 |
| 79 | 200 | 7-Mar-19 | 14-Mar-19 | 6.083541667 |
| 80 | 200 | 28-Feb-19 | 28-Mar-19 | 27.5291088 |
| 81 | 200 | 25-Mar-19 | 28-Mar-19 | 2.637824074 |
| 82 | 200 | 29-Jan-19 | 28-Feb-19 | 29.34280093 |
| 83 | 200 | 21-Feb-19 | 28-Feb-19 | 6.233831019 |
| 84 | 200 | 19-Feb-19 | 30-Apr-19 | 69.51832176 |
| 85 | 200 | 7-Feb-19 | 5-Mar-19 | 25.74865741 |
| 86 | 200 | 27-Feb-19 | 5-Mar-19 | 5.380034722 |
| 87 | 200 | 21-Feb-19 | 21-Mar-19 | 27.56310185 |
| 88 | 200 | 19-Mar-19 | 21-Mar-19 | 1.161828704 |
| 89 | 200 | 26-Feb-19 | 28-Mar-19 | 29.41315972 |
| 90 | 200 | 22-Mar-19 | 28-Mar-19 | 5.673703704 |
| 91 | 200 | 26-Feb-19 | 28-Mar-19 | 29.5131713 |
| 92 | 200 | 20-Mar-19 | 28-Mar-19 | 7.073414352 |
| 93 | 200 | 28-Feb-19 | 15-Apr-19 | 45.63513889 |
| 94 | 200 | 5-Apr-19 | 15-Apr-19 | 9.479456019 |
| 95 | 200 | 1-Mar-19 | 29-Mar-19 | 27.54568287 |
| 96 | 200 | 25-Mar-19 | 29-Mar-19 | 3.044340278 |
| 97 | 200 | 4-Mar-19 | 27-Mar-19 | 22.52392361 |
| 98 | 200 | 21-Mar-19 | 27-Mar-19 | 5.074421296 |
| 99 | 200 | 14-Feb-19 | 19-Mar-19 | 32.54349537 |
| 100 | 200 | 13-Mar-19 | 19-Mar-19 | 5.265266204 |
我当前的SQL查询如下:
SELECT
Step,
ROUND(MEDIAN(Part_Finish - Step_Start), 2) AS "The_Median",
ROUND(AVG(Part_Finish - Step_Start), 2) AS "The_Average",
PERCENTILE_CONT(0.20) WITHIN GROUP (ORDER BY (Part_Finish - Step_Start) ASC) AS "Best_Time",
PERCENTILE_CONT(0.80) WITHIN GROUP (ORDER BY (Part_Finish - Step_Start) ASC) AS "Worst_Time"
FROM
myTbl
GROUP BY
Step
但是,我不确定我的结果是否正确,因为我认为我没有正确使用PERCENTILE_CONT()
。如何根据最佳的20%的数据和最差的20%的数据,使用PERCENTILE_CONT()
(或另一种方法)找到平均值或中位数(以较容易为准)的“完成时间”? / p>
我希望一些结果看起来像这样:
| Step | The_Average | The_Median | Best_Time | Worst_Time |
|:----:|:-----------:|:----------:|:---------:|:----------:|
| 200 | < value > | < value > | < value > | < value > |
其中< value >
字段是正确计算的平均值,中位数以及数据集的最佳和最差值。最佳和最差是通过找到数据前20%(即最小的时间)或数据中最差的20%(即最大的时间)的平均值或中位数来计算的。
答案 0 :(得分:1)
PERCENTILE_CONT
是一个窗口函数,因此,如果您只想要一个由具有标量值的单个记录组成的结果集,则可以尝试选择不同的内容:
SELECT DISTINCT
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY Part_Finish - Step_Start) AS "The_median",
ROUND(AVG(Part_Finish - Step_Start) OVER (ORDER BY Part_Finish - Step_Start), 2) AS "The_Average",
PERCENTILE_CONT(0.20) WITHIN GROUP (ORDER BY Part_Finish - Step_Start) AS "Best_Time",
PERCENTILE_CONT(0.80) WITHIN GROUP (ORDER BY Part_Finish - Step_Start) AS "Worst_Time"
FROM myTbl;
采用上述方法的原因是,在整个表上选择窗口功能PERCENTILE_CONT
只会返回整个表作为结果集。但是,当您使用它时,每个记录的值将始终相同。因此,我们可以采用不同的值来获得单个结果。
如果您期望每个Step
的值都有不同的报告,那么您应该在对PARTITION BY
的调用中使用PERCENTILE_CONT
,例如
PERCENTILE_CONT(0.5) WITHIN GROUP (PARTITION BY Step
ORDER BY Part_Finish - Step_Start) AS "The_median"