Performance difference between two almost identical tables

Date: 2012-05-25 11:43:54

Tags: mysql

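I have two tables that are identical, except that one has a column of timestamp values and the other has a column of datetime values. The indexes are the same. The values are the same.

But when I run SELECT station, MAX(timestamp) AS max_timestamp FROM stations GROUP BY station; it executes very quickly if stations is the table with the timestamp column. If I try the table with the datetime column, I have yet to see the query finish. In both cases the timestamp column is indexed; only the type changes.

Where should I start looking? Or is datetime just not suited to searching and indexing?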

Here is what EXPLAIN gives:

+----+-------------+----------+-------+---------------+-------+---------+------+------+--------------------------+
| id | select_type | table    | type  | possible_keys | key   | key_len | ref  | rows | Extra                    |
+----+-------------+----------+-------+---------------+-------+---------+------+------+--------------------------+
|  1 | SIMPLE      | stations | range | NULL          | stamp | 33      | NULL | 1511 | Using index for group-by |
+----+-------------+----------+-------+---------------+-------+---------+------+------+--------------------------+

+----+-------------+-----------+-------+---------------+---------+---------+------+---------+-------+
| id | select_type | table     | type  | possible_keys | key     | key_len | ref  | rows    | Extra |
+----+-------------+-----------+-------+---------------+---------+---------+------+---------+-------+
|  1 | SIMPLE      | stations2 | index | NULL          | station | 2       | NULL | 3025467 |       |
+----+-------------+-----------+-------+---------------+---------+---------+------+---------+-------+

SHOW CREATE TABLE:

+----------+------------------------------------------------------------+
| stations | CREATE TABLE `stations` (
  `station` varchar(10) COLLATE utf8_bin DEFAULT NULL,
  `available` smallint(6) DEFAULT NULL,
  `timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  UNIQUE KEY `stamp` (`station`,`timestamp`),
  KEY `time` (`timestamp`),
  KEY `timestamp` (`timestamp`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin |
+----------+------------------------------------------------------------+

+-----------+-----------------------------------------------------------+
| stations2 | CREATE TABLE `stations2` (
  `station` smallint(5) unsigned NOT NULL,
  `available` smallint(5) unsigned DEFAULT NULL,
  `timestamp` datetime DEFAULT NULL,
  KEY `station` (`station`),
  KEY `timestamp` (`timestamp`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin |
+-----------+-----------------------------------------------------------+

1 answer:

Answer 0 (score: 1)

You can see from EXPLAIN that no key is a candidate for row selection (possible_keys is NULL). You have no WHERE clause, so that makes sense.

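MySQL can use an index to determine MAX, and it can use an index to optimize GROUP BY. To optimize the combination of the two, however, the column in the GROUP BY clause and the column inside MAX() must both be in one composite index, with the GROUP BY column first. In the first table you have this composite index as the unique key named "stamp". The EXPLAIN result shows that MySQL is using that index ("Using index for group-by").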
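As a rough way to see the three cases separately, the statements below run EXPLAIN on each part against the asker's `stations` table, which already has the composite key `stamp` (`station`, `timestamp`). This is only a sketch: the comments describe what MySQL commonly reports, and the exact plan can vary with the server version and table statistics.

```sql
-- MAX alone: a single-column index on `timestamp` is enough.
-- MySQL commonly reports "Select tables optimized away" in Extra here.
EXPLAIN SELECT MAX(`timestamp`) FROM stations;

-- GROUP BY alone: any index whose leading column is `station` can be used.
EXPLAIN SELECT station FROM stations GROUP BY station;

-- GROUP BY plus MAX: needs one index on (station, timestamp).
-- The question's EXPLAIN output shows key = stamp and
-- Extra = "Using index for group-by" for this statement.
EXPLAIN SELECT station, MAX(`timestamp`) AS max_timestamp
FROM stations
GROUP BY station;
```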

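In the second table you do not have this composite index, so MySQL has to do more work. It has to group the results itself and keep track of the MAX value for each station by scanning every row, which is what the second EXPLAIN shows: a full scan of the "station" index (type index, about 3 million rows). If you add the same composite index to the second table, you should see similar performance between the two.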
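A minimal sketch of that change is below. The index name `station_stamp` is made up for illustration, and the index is declared non-unique because nothing in the question confirms that (`station`, `timestamp`) pairs are unique in `stations2`. On a table of about 3 million rows, building the index may take some time.

```sql
-- Hypothetical index name; column order matters: GROUP BY column first,
-- then the column used inside MAX().
ALTER TABLE stations2
  ADD INDEX station_stamp (station, `timestamp`);

-- Re-check the plan. If the optimizer picks the new index, Extra should
-- show "Using index for group-by", as it does for the first table.
EXPLAIN SELECT station, MAX(`timestamp`) AS max_timestamp
FROM stations2
GROUP BY station;
```

Once the composite index exists, the single-column `station` key is probably redundant, since `station` is the leading column of the new index. Whether to drop it depends on the other queries that run against the table.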

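However, TIMESTAMP will likely still slightly outperform DATETIME, because TIMESTAMP is stored as a single 4-byte integer value, which is cheaper to process than the 8-byte DATETIME value. (These sizes apply to MySQL versions before 5.6.4; later versions store DATETIME in 5 bytes plus any fractional-seconds storage.) The larger the data set, the more noticeable the difference.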
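One rough way to look at the storage side on your own data is to compare the reported data and index sizes of the two tables. This is only a sketch. For InnoDB the figures in information_schema are estimates, and the two tables in the question also differ in the type of `station` (varchar versus smallint), so the numbers do not isolate the TIMESTAMP versus DATETIME difference.

```sql
-- Approximate on-disk sizes for both tables in the current database.
SELECT TABLE_NAME,
       TABLE_ROWS,    -- estimated row count for InnoDB
       DATA_LENGTH,   -- bytes used by the clustered index (row data)
       INDEX_LENGTH   -- bytes used by secondary indexes
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME IN ('stations', 'stations2');
```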