Table:
CREATE TABLE IF NOT EXISTS `l_not_200_page` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`server` tinyint(3) unsigned NOT NULL,
`domain` tinyint(3) unsigned NOT NULL,
`page` varchar(128) NOT NULL,
`query_string` varchar(384) NOT NULL,
`status` smallint(5) unsigned NOT NULL,
PRIMARY KEY (`id`),
KEY `idx_time_domain_status_page` (`time`,`domain`,`status`,`page`),
KEY `page` (`page`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=1 ;
EXPLAIN:
EXPLAIN SELECT *
FROM `l_not_200_page`
WHERE TIME
BETWEEN TIMESTAMP( '2014-03-25' )
AND TIMESTAMP( '2014-03-25 23:59:59' )
AND domain =1
AND STATUS = 404
GROUP BY PAGE
id: 1
select_type: SIMPLE
table: l_not_200_page
type: range
possible_keys: idx_time_domain_status_page
key: idx_time_domain_status_page
key_len: 7
ref: NULL
rows: 1
Extra: Using where; Using temporary; Using filesort
It is slow. How can I optimize it?
SQL:
SELECT PAGE, COUNT(*) AS cnt
FROM l_not_200_page
WHERE TIME
BETWEEN TIMESTAMP('2014-03-26 12:00:00')
AND TIMESTAMP('2014-03-26 12:30:00')
AND domain = 1
AND STATUS = 499
GROUP BY PAGE ORDER BY cnt DESC
LIMIT 100
The table receives about 9 million rows per day.
Answer 0 (score: 1)
Change the index to:
create index `idx_domain_status_time_page` on l_not_200_page(`domain`, `status`, `time`, `page`)
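When MySQL uses an index for a where clause, the best index lists all the fields used in equality comparisons first, followed by one field used in an inequality such as between. With time as the first element, MySQL cannot use the index for domain and status (well, it uses an index scan rather than a direct lookup).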
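The column-order rule can be checked on a small scale. The sketch below is an illustration under stated assumptions, not MySQL itself: it uses Python's built-in sqlite3 module with a simplified copy of the table and a hypothetical index name, idx_demo, and prints SQLite's EXPLAIN QUERY PLAN text for each column order. SQLite's planner differs from MySQL's, so treat it only as a picture of how a B-tree index is matched against equality and range conditions.

```python
import sqlite3

# Same shape as the slow query, with literal timestamps stored as text.
QUERY = """
SELECT page, COUNT(*) AS cnt
FROM l_not_200_page
WHERE time BETWEEN '2014-03-25 00:00:00' AND '2014-03-25 23:59:59'
  AND domain = 1
  AND status = 404
GROUP BY page
"""


def plan_for(index_columns):
    """Return SQLite's query-plan text for QUERY with a single index."""
    conn = sqlite3.connect(":memory:")
    # Simplified stand-in for the MySQL table (assumption: types reduced).
    conn.execute(
        "CREATE TABLE l_not_200_page ("
        "id INTEGER PRIMARY KEY, time TEXT, server INTEGER, "
        "domain INTEGER, page TEXT, query_string TEXT, status INTEGER)"
    )
    # idx_demo is a hypothetical name used only for this demonstration.
    conn.execute(f"CREATE INDEX idx_demo ON l_not_200_page ({index_columns})")
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
    conn.close()
    # The last column of each plan row holds the human-readable detail.
    return " | ".join(row[-1] for row in rows)


time_first = plan_for("time, domain, status, page")
equality_first = plan_for("domain, status, time, page")
print("time first:    ", time_first)
print("equality first:", equality_first)
```

With the equality columns first, the plan should list domain and status among the index search constraints; with time first, only the time range should appear there. On the real table, running EXPLAIN again after creating the new index is the way to confirm what MySQL actually does.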
To optimize further, you can get rid of the group by by selecting one row per page:
SELECT lp.* FROM l_not_200_page lp WHERE TIME BETWEEN TIMESTAMP( '2014-03-25' ) AND TIMESTAMP( '2014-03-25 23:59:59' ) AND
domain = 1 AND STATUS = 404 AND
NOT EXISTS (select 1
from l_not_200_page lp2
where lp2.page = lp.page and
lp2.domain = 1 and lp2.status = 404 and
                        lp2.TIME BETWEEN TIMESTAMP('2014-03-25') AND TIMESTAMP('2014-03-25 23:59:59') AND
lp2.id > lp.id
)
For this, an additional index on (page, domain, status, time) would help.
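To see what the NOT EXISTS form returns, here is a minimal sketch that runs the same query shape on a handful of made-up rows. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, a reduced column set, invented sample data, and a hypothetical index name, idx_page_domain_status_time; it checks result semantics only and says nothing about MySQL's speed.

```python
import sqlite3

# Keeps a row only if no later row (higher id) exists for the same page
# within the same domain, status and time window.
ONE_ROW_PER_PAGE = """
SELECT lp.id, lp.page
FROM l_not_200_page lp
WHERE lp.time BETWEEN :t_from AND :t_to
  AND lp.domain = :domain AND lp.status = :status
  AND NOT EXISTS (
        SELECT 1
        FROM l_not_200_page lp2
        WHERE lp2.page = lp.page
          AND lp2.domain = :domain AND lp2.status = :status
          AND lp2.time BETWEEN :t_from AND :t_to
          AND lp2.id > lp.id)
ORDER BY lp.page
"""

# Invented sample rows: (id, time, domain, page, status).
SAMPLE_ROWS = [
    (1, "2014-03-25 08:00:00", 1, "/a", 404),
    (2, "2014-03-25 09:00:00", 1, "/a", 404),
    (3, "2014-03-25 10:00:00", 1, "/b", 404),
    (4, "2014-03-25 11:00:00", 2, "/b", 404),  # other domain
    (5, "2014-03-26 01:00:00", 1, "/a", 404),  # outside the day
    (6, "2014-03-25 12:00:00", 1, "/c", 200),  # other status
]


def one_row_per_page():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE l_not_200_page ("
        "id INTEGER PRIMARY KEY, time TEXT, domain INTEGER, "
        "page TEXT, status INTEGER)"
    )
    # Hypothetical name for the additional index suggested in the answer.
    conn.execute(
        "CREATE INDEX idx_page_domain_status_time "
        "ON l_not_200_page (page, domain, status, time)"
    )
    conn.executemany(
        "INSERT INTO l_not_200_page VALUES (?, ?, ?, ?, ?)", SAMPLE_ROWS
    )
    params = {
        "t_from": "2014-03-25 00:00:00",
        "t_to": "2014-03-25 23:59:59",
        "domain": 1,
        "status": 404,
    }
    result = conn.execute(ONE_ROW_PER_PAGE, params).fetchall()
    conn.close()
    return result


print(one_row_per_page())
```

Each page appears once, represented by its highest matching id. Note that this form returns one row per page rather than a count per page, so it replaces the first query (SELECT * ... GROUP BY page), not the COUNT(*) variant.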