查询目标:
按地区展示比赛。
查询:
SELECT school_data_schools_outer.district_id,
school_data_race_ethnicity_raw_outer.year,
school_data_race_ethnicity_raw_outer.race,
ROUND(
SUM( school_data_race_ethnicity_raw_outer.count) /
(SELECT SUM(count)
FROM school_data_race_ethnicity_raw as school_data_race_ethnicity_raw_inner
INNER JOIN school_data_schools as school_data_schools_inner
USING (school_id)
WHERE school_data_schools_outer.district_id = school_data_schools_inner.district_id
AND school_data_race_ethnicity_raw_outer.year = school_data_race_ethnicity_raw_inner.year) * 100, 2)
FROM school_data_race_ethnicity_raw as school_data_race_ethnicity_raw_outer
INNER JOIN school_data_schools as school_data_schools_outer USING (school_id)
GROUP BY school_data_schools_outer.district_id,
school_data_race_ethnicity_raw_outer.year,
school_data_race_ethnicity_raw_outer.race
mysql> explain SELECT school_data_schools_outer.district_id, school_data_race_ethnicity_raw_outer.year, school_data_race_ethnicity_raw_outer.race,ROUND(SUM(school_data_race_ethnicity_raw_outer.count)/( SELECT SUM(count) FROM school_data_race_ethnicity_raw as school_data_race_ethnicity_raw_inner INNER JOIN school_data_schools as school_data_schools_inner USING (school_id) WHERE school_data_schools_outer.district_id = school_data_schools_inner.district_id and school_data_race_ethnicity_raw_outer.year = school_data_race_ethnicity_raw_inner.year ) * 100,2) FROM school_data_race_ethnicity_raw as school_data_race_ethnicity_raw_outer INNER JOIN school_data_schools as school_data_schools_outer USING (school_id) GROUP BY school_data_schools_outer.district_id, school_data_race_ethnicity_raw_outer.year, school_data_race_ethnicity_raw_outer.race;
+----+--------------------+--------------------------------------+--------+----------------------------+---------+---------+----------------------------------------------------------------------+-------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+--------------------------------------+--------+----------------------------+---------+---------+----------------------------------------------------------------------+-------+---------------------------------+
| 1 | PRIMARY | school_data_race_ethnicity_raw_outer | ALL | school_id,school_id_2 | NULL | NULL | NULL | 84012 | Using temporary; Using filesort |
| 1 | PRIMARY | school_data_schools_outer | eq_ref | PRIMARY | PRIMARY | 257 | rocdocs_main_drupal_7.school_data_race_ethnicity_raw_outer.school_id | 1 | |
| 2 | DEPENDENT SUBQUERY | school_data_race_ethnicity_raw_inner | ref | school_id,year,school_id_2 | year | 4 | func | 8402 | |
| 2 | DEPENDENT SUBQUERY | school_data_schools_inner | eq_ref | PRIMARY | PRIMARY | 257 | rocdocs_main_drupal_7.school_data_race_ethnicity_raw_inner.school_id | 1 | Using where |
+----+--------------------+--------------------------------------+--------+----------------------------+---------+---------+----------------------------------------------------------------------+-------+---------------------------------+
4 rows in set (0.00 sec)
mysql>
mysql> describe school_data_race_ethnicity_raw;
+-----------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| school_id | varchar(255) | NO | MUL | NULL | |
| year | int(11) | NO | MUL | NULL | |
| race | varchar(255) | NO | | NULL | |
| count | int(11) | NO | | NULL | |
+-----------+--------------+------+-----+---------+----------------+
5 rows in set (0.00 sec)
mysql> describe school_data_schools;
+-------------+----------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+----------------+------+-----+---------+-------+
| school_id | varchar(255) | NO | PRI | NULL | |
| grade_level | varchar(255) | NO | | NULL | |
| district_id | varchar(255) | NO | | NULL | |
| school_name | varchar(255) | NO | | NULL | |
| address | varchar(255) | NO | | NULL | |
| city | varchar(255) | NO | | NULL | |
| lat | decimal(20,10) | NO | | NULL | |
| lon | decimal(20,10) | NO | | NULL | |
+-------------+----------------+------+-----+---------+-------+
8 rows in set (0.00 sec)
注意:我也尝试过:
select sds.school_id,
detail.year,
detail.race,
ROUND((detail.count / summary.total) * 100 ,2) as percent
FROM school_data_race_ethnicity_raw as detail
inner join school_data_schools as sds USING (school_id)
inner join (
select sds2.district_id, year, sum(count) as total
from school_data_race_ethnicity_raw
inner join school_data_schools as sds2 USING (school_id)
group by sds2.district_id, year
) as summary on summary.district_id = sds.district_id
and summary.year = detail.year
答案 0 :(得分:0)
这很慢,因为:
最好的方法是不使用相关子查询,但如果没有,那么为了使其快速运行,您需要使用covering indexes以便整个内部查询(以及其他部分通过他们自己的索引)可以使用索引快速运行。有关索引主题的精彩教程,请检查this。它教会了我很多!现在,您的内部查询只使用school_data_race_ethnicity_raw上的年份索引,因此必须通过为84000个计算中的每一个读取8000行来查找所需的其余内容。索引会使这个更快,例如在school_data_race_ethnicity_raw上创建一个复合索引,你会发现它有帮助:
CREATE index inner_composite ON school_data_race_ethnicity_raw (year, district_id, schoolid, count)
这将允许从索引中获取WHERE中使用的所有字段,然后是连接字段,然后是选择所需的字段。您应该会在解释结果的“关键”列中看到它。此外,如果你做对了,你会在最右边的列中看到'使用索引',表明没有发生表访问,这快了几个数量级。
您可以通过为查询提及的列添加大量索引来实验快速和脏的样式,并查看在键列中拾取的内容。如果出现了某些内容,请阅读您的查询以查看该表中正在使用的其他列,然后添加一个新索引,并在右侧添加这些列,并查看是否更好。一旦找到有用的索引,请记得删除未使用的索引。
MySQL不允许你直接索引列的SUM,这是最快的方式,所以除非你想转移到另一个数据库(好主意,如果你可以),这总是有点慢。
答案 1 :(得分:-1)
这应该是您需要汇总数据以获得按地区划分的比赛所需的全部数据,不确定为什么您在原始数据中进行了如此多的数学运算,因为没有必要实现您的目标,并且正在强迫一些疯狂的子查询。
SELECT SUM(students.count) as studentCount, School.district_id, students.race
FROM school_data_schools schools,
school_data_race_ethnicity_raw students
WHERE shools.school_id = students.school_id
GROUP BY district_id, race
您可能还需要school_data_race_ethnicity_raw.school_id上的索引(单独,不作为多列键的一部分)
编辑并不知道OP正在寻找百分比细分,而不仅仅是总数
SELECT ((studentCount / districtTotal) * 100) as percentage, district_id, race
FROM(
SELECT SUM(students.count) as studentCount, Schools.district_id, students.race,
(SELECT SUM(inStudents.count)
FROM school_data_schools inSchools,
school_data_race_ethnicity_raw inStudents
WHERE inSchools.school_id = inStudents.school_id
AND inSchools.district_ID = Schools.district_id
GROUP BY inSchools.district_id) as districtTotal
FROM school_data_schools schools,
school_data_race_ethnicity_raw students
WHERE schools.school_id = students.school_id
GROUP BY district_id, race
) table1
这将很快运行,仍然需要确保school_data_race_ethnicity_raw.school_id上有一个不属于多列索引的索引。你可以在行动here中看到它,虽然我的测试用例相当小,但它确实可以检查出来。