SELECT *
FROM grants
INNER JOIN people ON grants.volID = people.vol_id
INNER JOIN org ON grants.orgID = org.orgid
ORDER BY yearStart DESC
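I have this ^ join, and it runs fine on its own. Once I open the row result and start iterating over it, I run a second query that pulls a count and date information from another table: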
SELECT COUNT(Distinct Event_ID) as ME, MAX(Sample_Date) as MaxD
FROM results where orgid=%d
I need the data from the first query to get the orgid, which is why I can only go through them one at a time.
So it runs like this:
Query 1
while($row = mysql_fetch_assoc($result)){
    Query 2
    while($row1 = mysql_fetch_assoc($result1)){
        get some data from 2
    } //close 2
    get some data from 1 and merge with 2
} //close 1
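Without the secondary query inside the loop, it runs very quickly over about 230 records. With it, it slows down to nearly 20 seconds! Am I not building the COUNT DISTINCT correctly? The results table has about 100,000 records, but I work with it in other queries and it never gives me trouble like this! If it helps, how would I turn this into a subquery?
Thanks for any insight.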
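Answer 0 (score: 1)
To find the performance bottleneck in a query, start with the database's EXPLAIN feature so it can tell you what it is doing. https://dev.mysql.com/doc/refman/5.0/en/explain.html
It sounds like some indexes may not be set up properly, so unnecessary rows get scanned every time you loop over a result of the first join query. Here is how to check, using a small example:
First, I have a test table: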
mysql> desc test_table;
+-------------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+-------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| name | varchar(64) | YES | | NULL | |
| description | text | YES | | NULL | |
| published | datetime | YES | | NULL | |
| updated | datetime | YES | | NULL | |
| status | tinyint(1) | YES | | NULL | |
+-------------+-------------+------+-----+---------+----------------+
6 rows in set (0.02 sec)
mysql> show indexes from test_table;
+------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| test_table | 0 | PRIMARY | 1 | id | A | 0 | NULL | NULL | | BTREE | | |
+------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
1 row in set (0.01 sec)
mysql> select count(1) from test_table;
+----------+
| count(1) |
+----------+
| 0 |
+----------+
1 row in set (0.02 sec)
Next, I add a few rows:
mysql> INSERT INTO test_table (name, description, published, status) VALUES ('name1','description 1 goes here',now(),1),('name2','description 2 goes here',now(),1),('name3', 'description 3 goes here', now(),1);
Query OK, 3 rows affected (0.02 sec)
Records: 3 Duplicates: 0 Warnings: 0
mysql> select name, description from test_table where status = 1;
+-------+-------------------------+
| name | description |
+-------+-------------------------+
| name1 | description 1 goes here |
| name2 | description 2 goes here |
| name3 | description 3 goes here |
+-------+-------------------------+
3 rows in set (0.01 sec)
Next, I use the database's EXPLAIN feature to analyze my query:
mysql> EXPLAIN SELECT name, description, status FROM test_table WHERE name = 'name1' AND status = 1;
+----+-------------+------------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+------+---------------+------+---------+------+------+-------------+
| 1 | SIMPLE | test_table | ALL | NULL | NULL | NULL | NULL | 3 | Using where |
+----+-------------+------------+------+---------------+------+---------+------+------+-------------+
1 row in set (0.00 sec)
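You can see that 3 rows are scanned to find the record. I suspect your database is scanning all 100K rows of the table in the second query for every row you iterate over. That means if the first query returns 100 rows, you end up with 10 million row scans (100 * 100K); with the roughly 230 rows you mention, it is about 23 million. You want the rows column to be as close to 1 as possible, which means an index is being used to find the rows faster.
I now create an index that includes the columns I expect in my WHERE clause (in the order I will add them; note that not every column has to be used each time, because MySQL can use a leftmost prefix of a composite index):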
mysql> CREATE INDEX idx_so_example ON test_table (name, description (255), status);
Query OK, 0 rows affected (0.04 sec)
Records: 0 Duplicates: 0 Warnings: 0
Next, I run EXPLAIN again and can immediately see that the database uses the index and scans only 1 row. You should tune your indexes to get a similar result.
mysql> EXPLAIN SELECT name, description, status FROM test_table WHERE name = 'name1' AND status = 1;
+----+-------------+------------+------+----------------+----------------+---------+-------+------+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+------+----------------+----------------+---------+-------+------+-----------------------+
| 1 | SIMPLE | test_table | ref | idx_so_example | idx_so_example | 195 | const | 1 | Using index condition |
+----+-------------+------------+------+----------------+----------------+---------+-------+------+-----------------------+
1 row in set (0.01 sec)
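Column order in a composite index matters for this kind of check. The sketch below is a rough illustration only: it uses Python's built-in sqlite3 module as a stand-in for MySQL, so the plan text differs from MySQL's EXPLAIN output, and the index is simplified to (name, status) because SQLite has no prefix-length syntax like description (255). A query that filters on the leading column can seek into the index, while one that filters only on a later column falls back to a full scan.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_table ("
    "id INTEGER PRIMARY KEY, name TEXT, description TEXT, status INTEGER)"
)
conn.executemany(
    "INSERT INTO test_table (name, description, status) VALUES (?, ?, ?)",
    [("name%d" % i, "description %d goes here" % i, 1) for i in range(1, 4)],
)
# Simplified version of idx_so_example: name is the leading column.
conn.execute("CREATE INDEX idx_so_example ON test_table (name, status)")


def plan(sql):
    """Return SQLite's query plan for sql as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


# Filters on the leading column: the planner can seek into the index.
plan_by_name = plan(
    "SELECT name, description, status FROM test_table "
    "WHERE name = 'name1' AND status = 1"
)
# Skips the leading column: the planner scans the whole table instead.
plan_by_status = plan(
    "SELECT name, description, status FROM test_table WHERE status = 1"
)

print("by name:  ", plan_by_name)
print("by status:", plan_by_status)
```

The exact wording of the plan varies between SQLite versions, but the first plan should mention a SEARCH using idx_so_example and the second a plain SCAN.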
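For your database, I would add a composite index on the three columns used by the second query, assuming 'results' is the actual table name as in your question. Put orgid first: it is the column in the WHERE clause, and MySQL can only seek on a leftmost prefix of the index.
CREATE INDEX idx_some_name ON results (orgid, Event_ID, Sample_Date);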
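To sanity-check an index like this before touching the real database, something along these lines may help. It is a sketch under assumptions: sqlite3 stands in for MySQL, the results table is reduced to the three columns named in the question, the sample data is invented, and the index name idx_results_org is hypothetical. It uses an index led by orgid, since that is the column the WHERE clause filters on, and compares the plan for the count query before and after the index exists.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results (Event_ID INTEGER, Sample_Date TEXT, orgid INTEGER)"
)
# Invented sample rows: 1000 results spread over 10 organisations.
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?)",
    [(i % 50, "2015-01-%02d" % (i % 28 + 1), i % 10) for i in range(1000)],
)

# The second query from the question, with a bound parameter instead of %d.
QUERY = (
    "SELECT COUNT(DISTINCT Event_ID) AS ME, MAX(Sample_Date) AS MaxD "
    "FROM results WHERE orgid = ?"
)


def plan(connection):
    """Return the planner's description of how QUERY would run."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, (3,)).fetchall()
    return " | ".join(row[-1] for row in rows)


plan_before = plan(conn)
# Hypothetical index name; orgid leads because it is the filter column.
conn.execute(
    "CREATE INDEX idx_results_org ON results (orgid, Event_ID, Sample_Date)"
)
plan_after = plan(conn)

print("before:", plan_before)
print("after: ", plan_after)
```

On MySQL the equivalent check is to run EXPLAIN on the second query before and after creating the index and compare the type, key, and rows columns.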
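One more suggestion: keep your naming convention consistent across fields, or the database becomes a nightmare to remember and code against. Pick one standard and stick with it. EventId, SampleDate, OrgId is fine, and so is event_id, sample_date, org_id, but standardize all column names and conventions so you hit fewer syntax errors in your code later when querying the data.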
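On the subquery part of the question: one possible approach is to aggregate results once per orgid in a derived table and join it to the first query, so the loop no longer issues a query per row. The sketch below is not tested against your schema. It runs on sqlite3 with invented sample data, and the name columns on people and org are hypothetical; only the join keys, yearStart, Event_ID, Sample_Date, and orgid come from the question.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE org     (orgid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE people  (vol_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE grants  (volID INTEGER, orgID INTEGER, yearStart INTEGER);
CREATE TABLE results (Event_ID INTEGER, Sample_Date TEXT, orgid INTEGER);

INSERT INTO org VALUES (1, 'Org A'), (2, 'Org B');
INSERT INTO people VALUES (10, 'Ann'), (11, 'Bob');
INSERT INTO grants VALUES (10, 1, 2014), (11, 2, 2012);
INSERT INTO results VALUES
    (100, '2014-05-01', 1),
    (100, '2014-06-01', 1),
    (101, '2014-07-15', 1);
""")

# One query: the derived table "stats" computes the count and max date
# per orgid once, and LEFT JOIN keeps grants whose org has no results.
COMBINED = """
SELECT grants.yearStart, people.name, org.name, stats.ME, stats.MaxD
FROM grants
INNER JOIN people ON grants.volID = people.vol_id
INNER JOIN org ON grants.orgID = org.orgid
LEFT JOIN (
    SELECT orgid,
           COUNT(DISTINCT Event_ID) AS ME,
           MAX(Sample_Date) AS MaxD
    FROM results
    GROUP BY orgid
) AS stats ON stats.orgid = org.orgid
ORDER BY grants.yearStart DESC
"""

rows = conn.execute(COMBINED).fetchall()
for row in rows:
    print(row)
```

One difference from the per-row version: an org with no rows in results comes back with NULL for ME and MaxD instead of a count of 0, so wrap the count in COALESCE(stats.ME, 0) if you need a number there.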