Question

Here is the query:

select timespans.id as timespan_id, count(*) as num
 from reports, timespans
 where  timespans.after_date >= '2011-04-13 22:08:38' and
        timespans.after_date <= reports.authored_at and
        reports.authored_at < timespans.before_date
 group by timespans.id;

Here are the table definitions:

CREATE TABLE `reports` (
  `id` int(11) NOT NULL auto_increment,
  `source_id` int(11) default NULL,
  `url` varchar(255) default NULL,
  `lat` decimal(20,15) default NULL,
  `lng` decimal(20,15) default NULL,
  `content` text,
  `notes` text,
  `authored_at` datetime default NULL,
  `created_at` datetime default NULL,
  `updated_at` datetime default NULL,
  `data` text,
  `title` varchar(255) default NULL,
  `author_id` int(11) default NULL,
  `orig_id` varchar(255) default NULL,
  PRIMARY KEY  (`id`),
  KEY `index_reports_on_title` (`title`),
  KEY `index_content_on_reports` (`content`(128))
);

CREATE TABLE `timespans` (
  `id` int(11) NOT NULL auto_increment,
  `after_date` datetime default NULL,
  `before_date` datetime default NULL,
  `after_offset` int(11) default NULL,
  `before_offset` int(11) default NULL,
  `is_common` tinyint(1) default NULL,
  `created_at` datetime default NULL,
  `updated_at` datetime default NULL,
  `is_search_chunk` tinyint(1) default NULL,
  `is_day` tinyint(1) default NULL,
  PRIMARY KEY  (`id`),
  KEY `index_timespans_on_after_date` (`after_date`),
  KEY `index_timespans_on_before_date` (`before_date`)
);

Here is the EXPLAIN output:

+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+----------------------------------------------+
| id | select_type | table     | type  | possible_keys                                                | key                           | key_len | ref  | rows   | Extra                                        |
+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+----------------------------------------------+
|  1 | SIMPLE      | timespans | range | index_timespans_on_after_date,index_timespans_on_before_date | index_timespans_on_after_date | 9       | NULL |     84 | Using where; Using temporary; Using filesort | 
|  1 | SIMPLE      | reports   | ALL   | NULL                                                         | NULL                          | NULL    | NULL | 183297 | Using where                                  | 
+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+----------------------------------------------+

Here is the EXPLAIN output after I created an index on authored_at. As you can see, the index isn't actually being used (I think...).

+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+------------------------------------------------+
| id | select_type | table     | type  | possible_keys                                                | key                           | key_len | ref  | rows   | Extra                                          |
+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+------------------------------------------------+
|  1 | SIMPLE      | timespans | range | index_timespans_on_after_date,index_timespans_on_before_date | index_timespans_on_after_date | 9       | NULL |     86 | Using where; Using temporary; Using filesort   | 
|  1 | SIMPLE      | reports   | ALL   | index_reports_on_authored_at                                 | NULL                          | NULL    | NULL | 183317 | Range checked for each record (index map: 0x8) | 
+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+------------------------------------------------+

The reports table has roughly 142k rows, and the timespans table has far fewer.

The query currently takes about 3 seconds.

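Strangely, if I add an index on reports.authored_at, the query actually becomes much slower, taking about 20 seconds. I expected the opposite, since the index should make it easy to find the reports at either end of the range and discard the rest, instead of having to examine every report.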

Can anyone clarify? I'm stumped.

Answer 1

Instead of keeping separate indexes on before_date and after_date, try combining them into a single multi-column index. Then add the index on authored_at as well.
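
A minimal sketch of that suggestion, under some assumptions: the composite index name below is hypothetical, the column order (after_date first) is a guess based on the query filtering on after_date with a constant, and index_reports_on_authored_at is the name that appears in the second EXPLAIN in the question. Whether this actually improves the plan would need to be checked with EXPLAIN on the real data.

```sql
-- Replace the two single-column indexes on timespans with one
-- multi-column index (the new index name is hypothetical).
ALTER TABLE timespans
  DROP INDEX index_timespans_on_after_date,
  DROP INDEX index_timespans_on_before_date,
  ADD INDEX index_timespans_on_after_and_before_date (after_date, before_date);

-- Index on the column that is compared against the timespan bounds.
-- Skip this statement if the index from the question already exists.
ALTER TABLE reports
  ADD INDEX index_reports_on_authored_at (authored_at);
```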

Answer 2

I would rewrite your query like this:

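select t.id, count(*) as num from timespans t
  join reports r where t.after_date >= '2011-04-13 22:08:38'
  and r.authored_at >= '2011-04-13 22:08:38'  -- implied by the other conditions; gives MySQL a constant range on authored_at
  and t.after_date <= r.authored_at           -- kept from the original query so the counts stay the same
  and r.authored_at < t.before_date
group by t.id order by null;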

And add an index to the reports table:

alter table reports add index authored_at_idx(authored_at);
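
One way to check whether the new index helps is to run EXPLAIN on the rewritten query and compare the reports row with the plans shown in the question. The sketch below assumes the authored_at_idx index from the statement above exists. It keeps all of the question's predicates and adds the constant lower bound on authored_at. What the optimizer actually chooses depends on the MySQL version and on the data.

```sql
-- Sketch: inspect the plan for the rewritten query.
EXPLAIN
select t.id, count(*) as num
  from timespans t
  join reports r
 where t.after_date >= '2011-04-13 22:08:38'
   and r.authored_at >= '2011-04-13 22:08:38'  -- constant bound, usable for a range scan
   and t.after_date <= r.authored_at
   and r.authored_at < t.before_date
 group by t.id
 order by null;                                -- skips the implicit sort after GROUP BY
```

If the reports row shows type range with key authored_at_idx instead of type ALL, the constant bound is being used to limit the scan.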

Answer 3

You can use the database's partitioning feature on the after_date column. That should help.
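
A rough sketch of what partitioning on after_date could look like, with several assumptions: the server is MySQL 5.5 or later (RANGE COLUMNS is not available before that), after_date contains no NULL values, and the partition names and boundaries are purely illustrative. MySQL requires the partitioning column to be part of every unique key, including the primary key, so the primary key is widened first. Whether partitioning helps this particular query is something to measure, not assume.

```sql
-- The partitioning column must be part of the primary key,
-- and primary key columns cannot be NULL.
ALTER TABLE timespans
  MODIFY after_date datetime NOT NULL,
  DROP PRIMARY KEY,
  ADD PRIMARY KEY (id, after_date);

-- Hypothetical yearly partitions; adjust the boundaries to the real data.
ALTER TABLE timespans
  PARTITION BY RANGE COLUMNS (after_date) (
    PARTITION p_before_2011 VALUES LESS THAN ('2011-01-01 00:00:00'),
    PARTITION p_2011        VALUES LESS THAN ('2012-01-01 00:00:00'),
    PARTITION p_later       VALUES LESS THAN (MAXVALUE)
  );
```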

How to optimize a MySQL query that searches for rows within a date range
