Using MySQL 5.6.14-enterprise-commercial-advanced-log on Ubuntu 12.04 LTS, I see the following behavior when querying these tables:
CREATE TABLE `a` (
`id` varchar(32) DEFAULT NULL,
`request_time` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00'
) ENGINE=MyISAM DEFAULT CHARSET=latin1
PARTITION BY RANGE (UNIX_TIMESTAMP(request_time))
(PARTITION p15062116 VALUES LESS THAN (1434895200) ENGINE = MyISAM,
PARTITION p15062117 VALUES LESS THAN (1434898800) ENGINE = MyISAM,
PARTITION p15062118 VALUES LESS THAN (1434902400) ENGINE = MyISAM,
...
PARTITION rest VALUES LESS THAN MAXVALUE ENGINE = MyISAM)
CREATE TABLE `b` (
`id` varchar(50) NOT NULL,
`start_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`item` int(11) DEFAULT NULL,
`item2` int(11) DEFAULT NULL,
PRIMARY KEY (`id`,`start_time`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
PARTITION BY RANGE (UNIX_TIMESTAMP(start_time))
(PARTITION p15062516 VALUES LESS THAN (1435240800) ENGINE = MyISAM,
PARTITION p15062517 VALUES LESS THAN (1435244400) ENGINE = MyISAM
....
PARTITION rest VALUES LESS THAN MAXVALUE ENGINE = MyISAM)
This query takes 1 second to run:
SELECT SQL_NO_CACHE request_time, item
FROM a left join
(select *
from b
where start_time between '2015-06-28 10:00:00' and '2015-06-28 11:00:00'
) c using(id)
where request_time between '2015-06-28 10:00:00' and '2015-06-28 11:00:00'
EXPLAIN output:
+----+-------------+------------+------+---------------+-------------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+------+---------------+-------------+---------+------+--------+-------------+
| 1 | PRIMARY | a | ALL | NULL | NULL | NULL | NULL | 336972 | Using where |
| 1 | PRIMARY | <derived2> | ref | <auto_key0> | <auto_key0> | 152 | func | 10 | Using where |
| 2 | DERIVED | b | ALL | NULL | NULL | NULL | NULL | 39508 | Using where |
+----+-------------+------------+------+---------------+-------------+---------+------+--------+-------------+
3 rows in set (0.00 sec)
mysql> explain partitions SELECT SQL_NO_CACHE request_time, item FROM a left join (select * from b where start_time between '2015-06-28 10:00:00' and '2015-06-28 11:00:00') b using(id) where request_time between '2015-06-28 10:00:
+----+-------------+------------+---------------------+------+---------------+-------------+---------+------+--------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+---------------------+------+---------------+-------------+---------+------+--------+-------------+
| 1 | PRIMARY | a | p15062810,p15062811 | ALL | NULL | NULL | NULL | NULL | 336972 | Using where |
| 1 | PRIMARY | <derived2> | NULL | ref | <auto_key0> | <auto_key0> | 152 | func | 10 | Using where |
| 2 | DERIVED | b | p15062810,p15062811 | ALL | NULL | NULL | NULL | NULL | 39508 | Using where |
+----+-------------+------------+---------------------+------+---------------+-------------+---------+------+--------+-------------+
And this query takes 30 seconds to run:
SELECT SQL_NO_CACHE request_time, item
FROM a
left join b using(id)
where request_time between '2015-06-28 10:00:00' and '2015-06-28 11:00:00'
and (start_time between '2015-06-28 10:00:00' and '2015-06-28 11:00:00' or start_time is null) ;
EXPLAIN output:
+----+-------------+-----------+------+---------------+---------+---------+------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+------+---------------+---------+---------+------+--------+--------------------------+
| 1 | SIMPLE | a | ALL | NULL | NULL | NULL | NULL | 336972 | Using where |
| 1 | SIMPLE | b | ref | PRIMARY | PRIMARY | 152 | func | 395 | Using where; Using index |
+----+-------------+-----------+------+---------------+---------+---------+------+--------+--------------------------+
mysql> explain partitions SELECT SQL_NO_CACHE request_time, item FROM a left join b using(id) where request_time between '2015-06-28 10:00:00' and '2015-06-28 11:00:00' and start_time between '2015-06-28 10:00:00' and '2015-06-28 11:
+----+-------------+-----------+---------------------+------+---------------+---------+---------+------+--------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+---------------------+------+---------------+---------+---------+------+--------+--------------------------+
| 1 | SIMPLE | a | p15062810,p15062811 | ALL | NULL | NULL | NULL | NULL | 336972 | Using where |
| 1 | SIMPLE | b | p15062810,p15062811 | ref | PRIMARY | PRIMARY | 152 | func | 395 | Using where; Using index |
+----+-------------+-----------+---------------------+------+---------------+---------+---------+------+--------+--------------------------+
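I expected similar or better results from the second query, given that it can use the index on id and the 1-hour partitions. Both tables have 1,000,000 records.
Can you explain why the first query is more efficient than the second?
Can we restructure the first query so that it can become a view or a reusable query, rather than rebuilding the subquery for every join?
Thanks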
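Answer 0: (score: 0)
My guess is that the SQL engine has a hard time determining the partitions for the second query. You could try writing it as: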
SELECT SQL_NO_CACHE count(*)
FROM a left join
b
using(id)
where request_time between '2015-06-28 10:00:00' and '2015-06-28 11:00:00' and
start_time between '2015-06-28 10:00:00' and '2015-06-28 11:00:00'
UNION ALL
SELECT SQL_NO_CACHE count(*)
FROM a left join
b
using (id)
WHERE request_time between '2015-06-28 10:00:00' and '2015-06-28 11:00:00' and
start_time is null ;
I realize this returns two rows. If this helps the engine find the right partitions, it is easy enough to add the two values together.
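One way to add the two counts together in a single statement is to wrap the UNION ALL in a derived table and sum over it. This is only a minimal sketch and not part of the original suggestion: the aliases `cnt`, `total`, and `t` are made up for illustration, and whether this form keeps any partition-pruning benefit on 5.6 has not been tested here.

```sql
-- Sketch: combine both branch counts into one value.
-- cnt, total and t are illustrative aliases, not names from the original post.
SELECT SQL_NO_CACHE SUM(cnt) AS total
FROM (
    -- Branch 1: rows of a that have a matching row in b inside the time window
    SELECT COUNT(*) AS cnt
    FROM a
    LEFT JOIN b USING (id)
    WHERE request_time BETWEEN '2015-06-28 10:00:00' AND '2015-06-28 11:00:00'
      AND start_time BETWEEN '2015-06-28 10:00:00' AND '2015-06-28 11:00:00'
    UNION ALL
    -- Branch 2: rows of a that have no matching row in b at all
    SELECT COUNT(*) AS cnt
    FROM a
    LEFT JOIN b USING (id)
    WHERE request_time BETWEEN '2015-06-28 10:00:00' AND '2015-06-28 11:00:00'
      AND start_time IS NULL
) AS t;
```

Note that the `start_time BETWEEN` filter in the WHERE clause of the first branch discards the NULL-extended rows, so that LEFT JOIN effectively behaves like an inner join. Only the second branch relies on the outer join, since `start_time` is declared NOT NULL in `b` and can be NULL only for unmatched rows of `a`.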