Percentiles from histogram data of discrete values

Time: 2017-03-05 03:50:25

Tags: sql netezza

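I need to implement the following PostgreSQL code in Netezza. It computes a percentile from histogram data of discrete values: each grade column stores how many exams received that grade. The PostgreSQL version of this question was asked and answered here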

CREATE TABLE grades
AS
  SELECT name, exams, grade_poor, grade_fair, grade_good, grade_vgood
  FROM ( VALUES
    ( 'arun'  , 8  , 1 , 4 , 2 , 1 ),
    ( 'neha'  , 10 , 3 , 2 , 1 , 4 ),
    ( 'ram'   ,  5 , 1 , 1 , 3 , 0 ),
    ( 'radha' ,  8 , 0 , 3 , 1 , 4 )
  ) AS t(name,exams,grade_poor,grade_fair,grade_good,grade_vgood);

SELECT name, percentile_disc(0.5) WITHIN GROUP (ORDER BY gs1.x)
FROM (
  SELECT name,
    ARRAY[grade_poor, grade_fair, grade_good, grade_vgood]
  FROM grades
) AS t(name, grades)
CROSS JOIN LATERAL generate_series(1,4) AS gs1(x)
CROSS JOIN LATERAL generate_series(1,grades[gs1.x]) AS gs2(x)
GROUP BY name ORDER BY name;

The code can be run here

Result:

 name  | percentile_disc 
-------+-----------------
 arun  |               2
 neha  |               2
 radha |               3
 ram   |               3
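
To show where these values come from, here is a minimal Python sketch of the same arithmetic. It assumes the four grade columns map to buckets 1 to 4, as in the generate_series(1,4) call above. The function name percentile_disc is reused only for illustration and the code is not part of the PostgreSQL query.

```python
import math

def percentile_disc(counts, fraction):
    """Discrete percentile of a histogram whose buckets are numbered 1..n."""
    # Expand the histogram: bucket i appears counts[i-1] times, already sorted.
    expanded = [bucket
                for bucket, count in enumerate(counts, start=1)
                for _ in range(count)]
    if not expanded:
        return None
    # Pick the first value whose cumulative share reaches the fraction.
    position = max(math.ceil(fraction * len(expanded)), 1)
    return expanded[position - 1]

# arun: 1 poor, 4 fair, 2 good, 1 very good -> 1,2,2,2,2,3,3,4
print(percentile_disc([1, 4, 2, 1], 0.5))
```

For arun the expanded list has 8 entries, so the median is the 4th entry, which falls in bucket 2.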

1 answer:

Answer 0 (score: 1)

I believe Netezza supports percentile_disc(), so the main problem is unpacking the data:

SELECT name, percentile_disc(0.5) WITHIN GROUP (ORDER BY grade)
FROM ((SELECT name, grade_poor as grade
       FROM grades
      ) UNION ALL
      (SELECT name, grade_fair as grade
       FROM grades
      ) UNION ALL
      (SELECT name, grade_good as grade
       FROM grades
      ) UNION ALL
      (SELECT name, grade_vgood as grade
       FROM grades
      )
     ) g
GROUP BY name
ORDER BY name;
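
Note that the unpivot above passes the stored counts to percentile_disc() as if they were grades. For arun it would take the median of 1, 4, 2, 1 rather than of the expanded grades 1, 2, 2, 2, 2, 3, 3, 4. One way to keep the weights without generate_series() or LATERAL is to unpivot into (name, grade, cnt) rows, take a running total of cnt per name, and return the lowest grade whose running total reaches half of the overall total. The sketch below does that in plain SQL with window functions. The Python sqlite3 wrapper, the median_grades helper and the cnt / running / total column names are assumptions added so the sketch can be run without a Netezza instance. Whether Netezza accepts the query unchanged is untested here.

```python
import sqlite3

# Weighted discrete percentile via running totals (no row expansion needed).
QUERY = """
SELECT name, MIN(grade) AS percentile_disc
FROM (
  SELECT name, grade, cnt,
         SUM(cnt) OVER (PARTITION BY name ORDER BY grade
                        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running,
         SUM(cnt) OVER (PARTITION BY name) AS total
  FROM (
    SELECT name, 1 AS grade, grade_poor AS cnt FROM grades
    UNION ALL
    SELECT name, 2, grade_fair FROM grades
    UNION ALL
    SELECT name, 3, grade_good FROM grades
    UNION ALL
    SELECT name, 4, grade_vgood FROM grades
  ) AS g
) AS c
-- cnt > 0 skips empty buckets, which never appear in the expanded data
WHERE cnt > 0 AND running >= 0.5 * total
GROUP BY name
ORDER BY name
"""

ROWS = [
    ("arun", 8, 1, 4, 2, 1),
    ("neha", 10, 3, 2, 1, 4),
    ("ram", 5, 1, 1, 3, 0),
    ("radha", 8, 0, 3, 1, 4),
]

def median_grades(rows):
    """Load the sample table into an in-memory SQLite database and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE grades (name TEXT, exams INTEGER, grade_poor INTEGER, "
        "grade_fair INTEGER, grade_good INTEGER, grade_vgood INTEGER)"
    )
    conn.executemany("INSERT INTO grades VALUES (?, ?, ?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

for name, grade in median_grades(ROWS):
    print(name, grade)
```

For a different percentile, replace 0.5 in the WHERE clause with the desired fraction. The running-total form avoids materializing one row per exam, which matters if the counts are large.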