下表列出了多项考试中的学生成绩数据。
CREATE TABLE grades
AS
SELECT name, exams, grade_poor, grade_fair, grade_good, grade_vgood
FROM ( VALUES
( 'arun' , 8 , 1 , 4 , 2 , 1 ),
( 'neha' , 10 , 3 , 2 , 1 , 4 ),
( 'ram' , 5 , 1 , 1 , 3 , 0 ),
( 'radha' , 8 , 0 , 3 , 1 , 4 )
) AS t(name,exams,grade_poor,grade_fair,grade_good,grade_vgood);
在vgood>的意义上对等级进行排序。好>公平>差
使用这些数据找到每个学生的第50个百分位等级是否可能(或者是否有意义)?例如 - 如果我们将数据视为一系列成绩类别,则在学生姓名为arun
的情况下 - 第50个百分位数为grade_fair
。
答案 0 :(得分:2)
首先你需要取消这个。我们可以这样做......
SELECT name,
ARRAY[grade_poor, grade_fair, grade_good, grade_vgood]
FROM grades
name | array
-------+-----------
arun | {1,4,2,1}
neha | {3,2,1,4}
ram | {1,1,3,0}
radha | {0,3,1,4}
然后我们需要索引成绩......我们用CROSS JOIN LATERAL
来做。我们有4行,数组为4.我们想要4 * 4行。
SELECT name, grades, gs1.x, grades[gs1.x] AS gradeqty
FROM (
SELECT name,
ARRAY[grade_poor, grade_fair, grade_good, grade_vgood]
FROM grades
) AS t(name, grades)
CROSS JOIN LATERAL generate_series(1,4) AS gs1(x)
ORDER BY name, x;
name | grades | x | gradeqty
-------+-----------+---+----------
arun | {1,4,2,1} | 1 | 1
arun | {1,4,2,1} | 2 | 4
arun | {1,4,2,1} | 3 | 2
arun | {1,4,2,1} | 4 | 1
neha | {3,2,1,4} | 1 | 3
neha | {3,2,1,4} | 2 | 2
neha | {3,2,1,4} | 3 | 1
neha | {3,2,1,4} | 4 | 4
radha | {0,3,1,4} | 1 | 0
radha | {0,3,1,4} | 2 | 3
radha | {0,3,1,4} | 3 | 1
radha | {0,3,1,4} | 4 | 4
ram | {1,1,3,0} | 1 | 1
ram | {1,1,3,0} | 2 | 1
ram | {1,1,3,0} | 3 | 3
ram | {1,1,3,0} | 4 | 0
(16 rows)
现在剩下的是,我们需要再次CROSS JOIN LATERAL
再现x(我们的等级),而不是等级
SELECT name,
gs1.x
FROM (
SELECT name,
ARRAY[grade_poor, grade_fair, grade_good, grade_vgood]
FROM grades
) AS t(name, grades)
CROSS JOIN LATERAL generate_series(1,4) AS gs1(x)
CROSS JOIN LATERAL generate_series(1,grades[gs1.x]) AS gs2(x)
ORDER BY name, gs1.x;
name | x
-------+---
arun | 1
arun | 2
arun | 2
arun | 2
arun | 2
arun | 3
arun | 3
arun | 4
neha | 1
neha | 1
neha | 1
neha | 2
neha | 2
neha | 3
neha | 4
neha | 4
neha | 4
neha | 4
radha | 2
radha | 2
radha | 2
radha | 3
radha | 4
radha | 4
radha | 4
radha | 4
ram | 1
ram | 2
ram | 3
ram | 3
ram | 3
(31 rows)
现在我们GROUP BY name
然后我们使用Ordered-Set Aggregate Functions percent_disc
来完成这项工作..
SELECT name, percentile_disc(0.5) WITHIN GROUP (ORDER BY gs1.x)
FROM (
SELECT name,
ARRAY[grade_poor, grade_fair, grade_good, grade_vgood]
FROM grades
) AS t(name, grades)
CROSS JOIN LATERAL generate_series(1,4) AS gs1(x)
CROSS JOIN LATERAL generate_series(1,grades[gs1.x]) AS gs2(x)
GROUP BY name ORDER BY name;
name | percentile_disc
-------+-----------------
arun | 2
neha | 2
radha | 3
ram | 3
(4 rows)
想进一步深入了解它......
SELECT name, (ARRAY['Poor', 'Fair', 'Good', 'Very Good'])[percentile_disc(0.5) WITHIN GROUP (ORDER BY gs1.x)]
FROM (
SELECT name,
ARRAY[grade_poor, grade_fair, grade_good, grade_vgood]
FROM grades
) AS t(name, grades)
CROSS JOIN LATERAL generate_series(1,4) AS gs1(x)
CROSS JOIN LATERAL generate_series(1,grades[gs1.x]) AS gs2(x)
GROUP BY name
ORDER BY name;
name | array
-------+-------
arun | Fair
neha | Fair
radha | Good
ram | Good
(4 rows)
如果我们为新用户提供支持,我们可以获得稍微多变的输出。
INSERT INTO grades (name,grade_poor,grade_fair,grade_good,grade_vgood)
VALUES ('Bob', 0,0,0,100);
name | array
-------+-----------
arun | Fair
Bob | Very Good
neha | Fair
radha | Good
ram | Good
(5 rows)
答案 1 :(得分:1)
SELECT name, exams,
CASE WHEN 0.5 * exams <= grade_poor
THEN 'grade_poor'
WHEN 0.5 * exams <= grade_poor + grade_fair
THEN 'grade_fair'
WHEN 0.5 * exams <= grade_poor + grade_fair + grade_good
THEN 'grade_good'
ELSE 'grade_vgood' END AS median_grade;
这轮比赛结束,所以neha将得分&#34; grade_fair&#34;而radha将得分&#34; grade_good&#34;。如果您想要整理,请将<=
更改为<
。