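I have a query that fetches the rows from the last 15 minutes (between NOW() - INTERVAL 15 MINUTE and NOW()).
The calls table contains 39790720 rows;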
SELECT src,unique,dstchannel,chan,calldate
FROM calls
WHERE calldate BETWEEN (NOW() - INTERVAL 15 MINUTE) AND NOW()
AND (dstchannel LIKE '%TEXT1/%'
OR dstchannel LIKE '%TEXT2%'
OR dstchannel LIKE '%TEXT3%'
OR dstchannel REGEXP '^SIP/[[:digit:]]{10}-'
OR dstchannel LIKE '%TEXT4%'
OR dstchannel LIKE '%TEXT5%'
OR dstchannel LIKE '%TEXT6%'
OR dstchannel LIKE '%TEXT7%'
)
AND lastdata NOT LIKE '%TEXT8%'
LIMIT 39780720,39790720
Query 1 row in set (1 min 7.38 sec)
+-------------+--------------+------+-----+---------------------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------------------+-------+
| calldate | datetime | NO | | 0000-00-00 00:00:00 | |
| colum1 | varchar(80) | NO | | | |
| colum11 | varchar(80) | NO | | | |
| src | varchar(80) | NO | | | |
| colum12 | varchar(80) | NO | | | |
| chan | varchar(80) | NO | | | |
| dstchannel | varchar(80) | NO | | | |
| colum2 | varchar(80) | NO | | | |
| colum3 | varchar(80) | NO | | | |
| colum4 | int(11) | NO | | 0 | |
| colum5 | int(11) | NO | | 0 | |
| colum6 | varchar(45) | NO | | | |
| colum7 | int(11) | NO | | 0 | |
| colum8 | varchar(20) | NO | | | |
| colum9 | varchar(32) | NO | | | |
| colum10 | varchar(255) | NO | | | |
+-------------+--------------+------+-----+---------------------+-------+
How can I improve the query?
Update
+----+-------------+-------+------+---------------+------+---------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+----------+-------------+
| 1 | SIMPLE | calls | ALL | NULL | NULL | NULL | NULL | 39791545 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+----------+-------------+
Answer 0 (score: 0)
Holy extreme query pessimization, Batman!
Your query looks like this:
SELECT src,unique,dstchannel,chan,calldate
from calls
WHERE calldate BETWEEN (NOW() - INTERVAL 15 MINUTE) AND NOW()
AND ( dstchannel LIKE '%TEXT1/%'
OR dstchannel LIKE '%TEXT2%'
OR dstchannel LIKE '%TEXT3%'
OR dstchannel REGEXP '^SIP/[[:digit:]]{10}-'
OR dstchannel LIKE '%TEXT4%'
OR dstchannel LIKE '%TEXT5%'
OR dstchannel LIKE '%TEXT6%'
OR dstchannel LIKE '%TEXT7%')
AND lastdata NOT LIKE '%TEXT8%'
LIMIT 39780720,39790720
You can improve this query slightly by adding an index on calldate. That will help your calldate BETWEEN (NOW() - INTERVAL 15 MINUTE) AND NOW() clause.
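As a minimal sketch of that first step (the index name idx_calls_calldate is my own choice, not something from the question), you could add the index and then re-run EXPLAIN on just the time-range filter to see whether the planner picks it up:

```sql
-- Hypothetical index name; use whatever fits your naming scheme.
-- Building an index on roughly 40 million rows can take a while.
CREATE INDEX idx_calls_calldate ON calls (calldate);

-- Check the plan for the time-range part of the query on its own.
EXPLAIN
SELECT src, dstchannel, chan, calldate
FROM calls
WHERE calldate BETWEEN (NOW() - INTERVAL 15 MINUTE) AND NOW();
```

If the index is usable, I would expect the type column to change from ALL to range and the key column to show the new index, though the exact plan depends on your MySQL version and data.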
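But the way you have structured the query, it will never be all that fast. Why not?
dstchannel LIKE '%TEXT2%' and the similar clauses can never use an index. Why not? Because they have to search the whole column value for the string; they can't just look at the first characters of the column. Note that dstchannel LIKE 'TEXT2%' can use an index for random access, because it is an anchored search that starts at the beginning of the column.
lastdata NOT LIKE '%TEXT8%' has the same problem. But even if it were lastdata NOT LIKE 'TEXT8%' it would still cause trouble, because every row has to be examined. The server can't work out how to random-access a range of the data.
The OR clauses are a disaster. They often cause MySQL to scan the same data multiple times.
LIMIT 39780720,39790720 forces MySQL to wade through almost forty megarows of its result set and skip them. (In MySQL, LIMIT offset, count skips the first offset rows and then returns at most count rows.) That burns server memory, processor time, and disk IO just to throw the rows away. Could you use an ORDER BY clause cleverly, so that you retrieve the first rows of the result set instead of skipping past them?
What can you do to fix this? Your best bet is to rethink the whole LIKE '%something%' business.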
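To make two of those points concrete, here is a rough sketch. It assumes a single-column index on dstchannel (the name idx_calls_dstchannel is hypothetical and exists only for this comparison) and an arbitrary page size of 100; it is probably best tried on a copy of the table first.

```sql
-- Hypothetical index, only needed to compare the two LIKE patterns.
CREATE INDEX idx_calls_dstchannel ON calls (dstchannel);

-- Unanchored pattern: the leading % means every value must be inspected.
EXPLAIN SELECT src, dstchannel FROM calls WHERE dstchannel LIKE '%TEXT2%';

-- Anchored pattern: the prefix lets MySQL consider a range scan on the index.
EXPLAIN SELECT src, dstchannel FROM calls WHERE dstchannel LIKE 'TEXT2%';

-- Instead of skipping almost 40 million rows with a huge LIMIT offset,
-- order the rows and take only the first ones you need.
SELECT src, dstchannel, chan, calldate
FROM calls
WHERE calldate BETWEEN (NOW() - INTERVAL 15 MINUTE) AND NOW()
ORDER BY calldate DESC
LIMIT 100;
```

Comparing the type and rows columns of the two EXPLAIN outputs should show the difference between the patterns; whether the anchored one actually gets a range scan depends on how selective the prefix is.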
If you can't do that, maybe you can try recasting your query. I'm assuming there is a primary key on your calls table. I'll call it id.
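SELECT a.src, a.unique, a.dstchannel, a.chan, a.calldate
FROM calls a
JOIN (
    -- One branch per pattern from the original OR list; UNION removes duplicate ids.
    SELECT id FROM calls
    WHERE calldate BETWEEN (NOW() - INTERVAL 15 MINUTE) AND NOW()
    AND dstchannel LIKE '%TEXT1/%'
    UNION
    SELECT id FROM calls
    WHERE calldate BETWEEN (NOW() - INTERVAL 15 MINUTE) AND NOW()
    AND dstchannel LIKE '%TEXT2%'
    UNION
    SELECT id FROM calls
    WHERE calldate BETWEEN (NOW() - INTERVAL 15 MINUTE) AND NOW()
    AND dstchannel LIKE '%TEXT3%'
    UNION
    SELECT id FROM calls
    WHERE calldate BETWEEN (NOW() - INTERVAL 15 MINUTE) AND NOW()
    AND dstchannel REGEXP '^SIP/[[:digit:]]{10}-'
    UNION
    SELECT id FROM calls
    WHERE calldate BETWEEN (NOW() - INTERVAL 15 MINUTE) AND NOW()
    AND dstchannel LIKE '%TEXT4%'
    UNION
    SELECT id FROM calls
    WHERE calldate BETWEEN (NOW() - INTERVAL 15 MINUTE) AND NOW()
    AND dstchannel LIKE '%TEXT5%'
    UNION
    SELECT id FROM calls
    WHERE calldate BETWEEN (NOW() - INTERVAL 15 MINUTE) AND NOW()
    AND dstchannel LIKE '%TEXT6%'
    UNION
    SELECT id FROM calls
    WHERE calldate BETWEEN (NOW() - INTERVAL 15 MINUTE) AND NOW()
    AND dstchannel LIKE '%TEXT7%'
) b ON a.id = b.id
WHERE a.lastdata NOT LIKE '%TEXT8%'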
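Then create a compound index on your table on the columns (calldate, dstchannel, id). The MySQL query planner can then use that index to find the appropriate calldate range, scan the dstchannel values stored in the index for matches, and extract the id values. It then uses the id values in the JOIN to fetch exactly the rows it needs from the main table.
If you are working with call detail records, you really do need to learn about indexing. Read: http://use-the-index-luke.com/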
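A sketch of the setup this relies on follows. Two things here are assumptions of mine rather than facts from the question: the DESCRIBE output shows an empty Key column for every field, so I add an auto-increment id as the primary key, and the index name is made up.

```sql
-- Assumption: the table has no primary key yet (DESCRIBE shows no keys).
-- This rebuilds the table, so expect it to take a while on ~40 million rows.
ALTER TABLE calls
  ADD COLUMN id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY FIRST;

-- Compound index on the columns named above; the index name is hypothetical.
CREATE INDEX idx_calls_calldate_dstchannel_id
  ON calls (calldate, dstchannel, id);

-- Check one branch of the UNION on its own.
EXPLAIN
SELECT id FROM calls
WHERE calldate BETWEEN (NOW() - INTERVAL 15 MINUTE) AND NOW()
  AND dstchannel LIKE '%TEXT2%';
```

If the planner behaves as hoped, the branch should show a range access on the new index with "Using index" in the Extra column, meaning it never touches the table rows. On InnoDB, secondary indexes already carry the primary key, so listing id explicitly is likely redundant there but harmless.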