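How do I use GROUP BY with values that should always be included?

Asked: 2016-07-19 16:21:04

Tags: sql

I asked a similar question before, but I phrased it incorrectly. Suppose I have the following table: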

+-----------+------------+-------+
| quiz_type | student_id | score |
+-----------+------------+-------+
| class     | NULL       | 10    |
+-----------+------------+-------+
| class     | NULL       | 9     |
+-----------+------------+-------+
| student   | A          | 5     |
+-----------+------------+-------+
| student   | B          | 7     |
+-----------+------------+-------+
| student   | A          | 6     |
+-----------+------------+-------+

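I want to get the standard deviation of each student's scores, but the class scores must be included for every student. In reality the quiz_type column does not exist (it is only there to make the example easier to read). I need to GROUP BY student_id, but with every group also containing the rows whose student_id is NULL. I have been struggling with this. Is there a good solution?

For the sake of example, suppose I use the AVG aggregate function and want a table like this: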

+------------+---------+
| student_id | Average |
+------------+---------+
| A          | 7.5     |
+------------+---------+
| B          | 8.67    |
+------------+---------+

In reality I will be calling the STDDEV_SAMP function.
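To make the intended grouping concrete, here is a minimal sketch in plain Python rather than SQL. The names `rows`, `class_scores`, and `pooled_scores` are made up for illustration; it only reproduces the expected averages from the table above and is not meant as a solution.

```python
from statistics import mean

# Sample rows from the table above: (student_id, score); None stands for NULL.
rows = [(None, 10), (None, 9), ("A", 5), ("B", 7), ("A", 6)]

# Class-level scores are the rows without a student_id.
class_scores = [score for sid, score in rows if sid is None]


def pooled_scores(student_id):
    """Return one student's own scores plus every class-level score."""
    own = [score for sid, score in rows if sid == student_id]
    return class_scores + own


students = sorted({sid for sid, _ in rows if sid is not None})
averages = {sid: mean(pooled_scores(sid)) for sid in students}
print(averages)
```

Student A's group is 10, 9, 5, 6 and student B's group is 10, 9, 7, which is where 7.5 and 8.67 come from.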

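2 answers:

Answer 0 (score: 1)

One clever way to do this is to self-join the table so that each NULL row is paired with each non-NULL row. You can then use both score columns in your calculation. Try something like this: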

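-- Pair every NULL row (t1) with every non-NULL row (t2), then group by student.
SELECT t2.student_id,
       -- each student row appears once per NULL row, so divide that factor back out
       SUM(t2.score) / (SELECT SUM(CASE WHEN student_id IS NULL THEN 1 ELSE 0 END) FROM students) AS nonNullScore,
       -- mean of the NULL-row scores times their count, i.e. the sum of the NULL-row scores
       (SUM(t1.score) / COUNT(*)) * (SELECT SUM(CASE WHEN student_id IS NULL THEN 1 ELSE 0 END) FROM students) AS nullScore
FROM students t1
INNER JOIN students t2
    ON t1.student_id IS NULL AND t2.student_id IS NOT NULL
GROUP BY t2.student_id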

I tested this query in MySQL Workbench, and it appears to work.

Output:

student_id | nonNullScore | nullScore
    A      |   11.0000    |  19.0000
    B      |   7.0000     |  19.0000
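The two columns above are sums, not averages. As a rough sketch of a variant on the same self-join idea, the join below attaches to each student both their own rows and the NULL rows, so a single aggregate covers the whole group. It is run through Python's `sqlite3` only so it can be checked; the column types and the aliases `s` and `t` are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (student_id TEXT, score REAL)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [(None, 10), (None, 9), ("A", 5), ("B", 7), ("A", 6)],
)

# s lists each distinct student once; t contributes that student's rows
# plus every row whose student_id is NULL.
query = """
SELECT s.student_id, AVG(t.score) AS average
FROM (SELECT DISTINCT student_id
      FROM students
      WHERE student_id IS NOT NULL) AS s
INNER JOIN students AS t
    ON t.student_id = s.student_id OR t.student_id IS NULL
GROUP BY s.student_id
ORDER BY s.student_id
"""
result = conn.execute(query).fetchall()
print(result)
```

On MySQL, swapping `AVG(t.score)` for `STDDEV_SAMP(t.score)` in the same query should give the per-student sample standard deviation. SQLite has no built-in `STDDEV_SAMP`, so that form is not run here.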

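Answer 1 (score: 1)

Based on the question, the mean can be adjusted by including the scores of the NULL rows in the total. An adjusted standard deviation can then be computed per student from that adjusted mean. Note that in the query below, adjmean is the mean over every row of the table, and only each student's own rows contribute squared differences.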

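SELECT
  student_id,
  SQRT(AVG(squared_diff)) AS adjusted_std_deviation
FROM (
  -- squared distance of each student row from the adjusted mean
  SELECT
    t.student_id,
    POW(t.score - x.adjmean, 2) AS squared_diff
  FROM t
  -- x holds a single row: the mean over all scores, NULL rows included
  CROSS JOIN (SELECT AVG(1.0 * score) AS adjmean FROM t) x
  WHERE t.student_id IS NOT NULL
) y
GROUP BY student_id
ORDER BY 1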

Sample Fiddle

Calculating Standard Deviation
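
For comparison, here is a minimal sketch of the sample standard deviation taken over each student's pooled rows (own scores plus the NULL rows), which is one reading of what the question asks for. It relies on the identity s^2 = (SUM(x^2) - SUM(x)^2 / n) / (n - 1). The aggregates come from SQL and the square root is taken in Python, because SQLite's SQRT is not available in every build. The alias `r` and the column types are assumptions.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (student_id TEXT, score REAL)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(None, 10), (None, 9), ("A", 5), ("B", 7), ("A", 6)],
)

# For each student, aggregate over their own rows plus the NULL rows.
query = """
SELECT s.student_id,
       COUNT(*) AS n,
       SUM(r.score) AS total,
       SUM(r.score * r.score) AS total_sq
FROM (SELECT DISTINCT student_id FROM t WHERE student_id IS NOT NULL) AS s
INNER JOIN t AS r
    ON r.student_id = s.student_id OR r.student_id IS NULL
GROUP BY s.student_id
ORDER BY s.student_id
"""

sample_std = {}
for student_id, n, total, total_sq in conn.execute(query):
    # Sample variance: divide by n - 1 rather than n.
    variance = (total_sq - total * total / n) / (n - 1)
    sample_std[student_id] = math.sqrt(variance)
print(sample_std)
```

Unlike the query above, this uses a separate mean for each student's group and lets the NULL rows contribute to the deviations as well.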