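I am building a database to track student records for an after-school education company, including class enrollments and student information.
What I want to do is write a query that returns the number of students we have enrolled from each school, and that also groups together the schools whose totals fall below a certain percentage. (I want to show this in a chart, but we have many schools with only one student each, and I don't want the chart to have 50 bars or pie slices.)
So instead of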
+-------------+------------+
| School Name | # Students |
+-------------+------------+
| School A    | 52         |
| School B    | 27         |
| School C    | 15         |
| School D    | 2          |
| School E    | 1          |
| School F    | 1          |
+-------------+------------+
I would like
+---------------+------------+
| School Name   | # Students |
+---------------+------------+
| School A      | 52         |
| School B      | 27         |
| School C      | 15         |
| Other Schools | 4          |
+---------------+------------+
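Below is a simplified form of the query I am using now. It works, but it is somewhat redundant because it uses multiple SELECTs to query the same information. Is there any way to reduce the redundancy?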
SELECT @enrollmentSum := COUNT(StudentEnrollmentID) FROM StudentEnrollment;
SELECT SchoolName, COUNT(StudentEnrollmentID) ECount FROM Student
JOIN StudentEnrollment ON StudentEnrollment.StudentID = Student.StudentID
JOIN School ON Student.SchoolID = School.SchoolID
GROUP BY SchoolName
HAVING Ecount >= .025 * @enrollmentSum
UNION ALL
SELECT "Other Schools" as SchoolName, SUM(Ecount) as ECount FROM
(
SELECT SchoolName, COUNT(StudentEnrollmentID) ECount FROM Student
JOIN StudentEnrollment ON StudentEnrollment.StudentID = Student.StudentID
JOIN School ON Student.SchoolID = School.SchoolID
GROUP BY SchoolName
HAVING Ecount < .025 * @enrollmentSum
) t2
ORDER BY Ecount DESC
In case it is needed, here is the basic structure of the relevant tables:
Student
+-----------+-------------+----------+
| StudentID | StudentName | SchoolID |
+-----------+-------------+----------+
School
+----------+------------+
| SchoolID | SchoolName |
+----------+------------+
StudentEnrollment
+---------------------+-----------+---------+
| StudentEnrollmentID | StudentID | ClassID |
+---------------------+-----------+---------+
Thanks for your help.
Answer 0 (score: 0)
Tips:
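count(x) returns the number of rows where "x IS NOT NULL", so count(primary key) = count(*), and the latter is easier to read.
"JOIN School ON Student.SchoolID = School.SchoolID" can be rewritten as "JOIN School USING (SchoolID)", which is more readable and gives you only one "SchoolID" column in the result set if you use "SELECT *".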
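As a small illustration of the two tips above, here is a sketch against the tables from the question. It assumes StudentEnrollmentID is the primary key of StudentEnrollment (the question does not say so explicitly), and the column aliases are made up for the example.

```sql
-- COUNT(*) counts rows; COUNT(col) counts only rows where col IS NOT NULL.
-- Assumption: StudentEnrollmentID is the primary key, so it is never NULL
-- and both counts return the same value.
SELECT COUNT(*)                   AS all_rows,
       COUNT(StudentEnrollmentID) AS non_null_ids
FROM StudentEnrollment;

-- USING (SchoolID) merges the join column, so SELECT * returns
-- a single SchoolID column instead of one per table.
SELECT *
FROM Student
JOIN School USING (SchoolID);
```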
Now, the query...
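-- @enrollmentSum must already be set by the first SELECT from the question
SELECT SchoolName, SUM(cnt) AS ECount FROM
(SELECT IF(COUNT(*) >= .025 * @enrollmentSum, SchoolName, 'Others') AS SchoolName,
COUNT(*) AS cnt FROM Student
JOIN StudentEnrollment USING (StudentID)
JOIN School USING (SchoolID)
-- qualify the column so it is not confused with the SchoolName alias above
GROUP BY School.SchoolName) subq
GROUP BY SchoolName
ORDER BY ECount DESC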
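Using IF() replaces the school name with 'Others' for every school below the threshold. Note that this is evaluated after the GROUP BY, so you can actually use count(*) in the select expression. The second GROUP BY then groups the 'Others' rows together.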
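The query above still depends on @enrollmentSum having been set by a separate statement. One way to fold everything into a single statement might be to compute the total in a one-row derived table and cross join it. This is only a sketch under that assumption: the alias names tot and total are invented for the example, and MAX() is applied to the total only so that the expression stays valid when ONLY_FULL_GROUP_BY is enabled.

```sql
SELECT SchoolName, SUM(cnt) AS ECount
FROM (
    SELECT IF(COUNT(*) >= 0.025 * MAX(tot.total),
              School.SchoolName, 'Other Schools') AS SchoolName,
           COUNT(*) AS cnt
    FROM Student
    JOIN StudentEnrollment USING (StudentID)
    JOIN School USING (SchoolID)
    -- one-row derived table holding the overall enrollment count
    CROSS JOIN (SELECT COUNT(*) AS total FROM StudentEnrollment) AS tot
    GROUP BY School.SchoolName
) AS subq
GROUP BY SchoolName
ORDER BY ECount DESC;
```

With the sample numbers from the question (98 enrollments in total), the threshold is 0.025 * 98 = 2.45, so School D, School E and School F would be folded into a single 'Other Schools' row of 4.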
Edit:
This is quite a hack, but it seems to do what you want...
SET @total=0;
SELECT IF(cnt/@total>=0.2, SchoolName, 'Others') SN, sum(cnt) FROM (
SELECT SchoolName, cnt, @total:=@total+cnt FROM (
SELECT SchoolName, count(*) cnt FROM st GROUP BY SchoolName
) AS foo -- ORDER BY cnt DESC
) AS bar
GROUP BY SN ORDER BY sum(cnt) DESC;
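This is perverse. MySQL always seems to materialize the inner subqueries first, storing the result in a buffer, before processing the query that wraps them: first "foo", then "bar". I thought the "ORDER BY cnt DESC" was necessary, but it also seems to work when it is commented out.
Running subquery "bar" over the rows of "foo" has the side effect of setting @total to the value we want!
So by the time the outermost SELECT runs, the total is available.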
The problem with this approach is that it could stop working without warning, because it is a hack.
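If the server is MySQL 8.0 or later (an assumption, since the question does not state a version), a window function might give the same result without relying on user variables or materialization order. The following is a sketch, not a tested replacement; per_school, cnt and total are names made up for the example, and the 2.5% threshold is taken from the question.

```sql
SELECT IF(cnt >= 0.025 * total, SchoolName, 'Other Schools') AS SN,
       SUM(cnt) AS ECount
FROM (
    SELECT School.SchoolName,
           COUNT(*) AS cnt,
           -- window over the grouped rows: sums every school's count
           SUM(COUNT(*)) OVER () AS total
    FROM Student
    JOIN StudentEnrollment USING (StudentID)
    JOIN School USING (SchoolID)
    GROUP BY School.SchoolName
) AS per_school
GROUP BY SN
ORDER BY ECount DESC;
```

Note that total here is the sum of the joined rows, so it can differ from COUNT(*) on StudentEnrollment if some enrollments point to a student or school that does not exist.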