My database has about 250 tables, each with 439340 rows.
mysql> SHOW CREATE TABLE data.b50d1 ;
+-------+--------------------------------------------------------------------------------------------
CREATE TABLE `b50d1` (
`pTime` int(10) unsigned NOT NULL,
`Slope` double NOT NULL,
`STD` double NOT NULL,
PRIMARY KEY (`pTime`),
KEY `Slope` (`Slope`) USING BTREE
) ENGINE=MyISAM DEFAULT CHARSET=latin1 MIN_ROWS=43940 MAX_ROWS=43940 PACK_KEYS=1 ROW_FORMAT=FIXED |
+-------+--------------------------------------------------------------------------------------------
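As you can see, each table has three columns: pTime is the PRIMARY KEY, and the Slope and STD columns hold signed double values that vary from row to row and from table to table.
Here is a small sample from one of the tables: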
mysql> SELECT * FROM data.b50d1 limit 10;
+------------+------------+-------------+
| pTime | Slope | STD |
+------------+------------+-------------+
| 1104537600 | 6.38733032 | -1.13387667 |
| 1104537900 | 5.58733032 | -0.93810617 |
| 1104538200 | 5.30135747 | -0.51912757 |
| 1104538500 | 5.4678733 | -0.54460575 |
| 1104538800 | 5.58190045 | -0.46369055 |
| 1104539100 | 5.50226244 | -0.46712018 |
| 1104714000 | 5.31221719 | -0.25210485 |
| 1104714300 | 4.72941176 | 0.00321249 |
| 1104714600 | 5.19638009 | 0.64116376 |
| 1104714900 | 5.12941176 | 0.39599099 |
+------------+------------+-------------+
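I run a stored procedure against these tables. The procedure consists of the following steps:
Step 1) CREATE TEMPORARY TABLE MainList ...
Step 2) INSERT the results of a SELECT statement into that table. The resulting dataset is a filtered combination of the original tables.
Step 3) A SELECT statement with nested JOINs iterates over each MainList.STD value in the TEMPORARY table (MainList) and returns, from one of the original tables, the first row that meets certain criteria (see the example below).
Step 4) JOIN the results to MainList and output them to the user.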
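To make step 3 concrete, here is a minimal Python sketch of the lookup the correlated subquery performs for each MainList row. It is only an illustration of the intended logic, not part of the procedure: the helper names `sign` and `first_close_time` are hypothetical, and the data is the ten-row sample shown above.

```python
def sign(x):
    """Mimic MySQL's SIGN(): returns -1, 0 or 1."""
    return (x > 0) - (x < 0)


def first_close_time(rows, open_time, open_std, end_time):
    """Return the pTime of the first row after open_time (and not after
    end_time) whose STD has a different sign and at least the same
    magnitude as open_std. Returns None when no row qualifies, which
    corresponds to the NULL produced by the LEFT JOIN."""
    for p_time, std in rows:  # rows must be sorted by pTime
        if p_time <= open_time or p_time > end_time:
            continue
        if sign(std) != sign(open_std) and abs(std) >= abs(open_std):
            return p_time
    return None


# (pTime, STD) pairs copied from the sample rows of data.b50d1
sample = [
    (1104537600, -1.13387667),
    (1104537900, -0.93810617),
    (1104538200, -0.51912757),
    (1104538500, -0.54460575),
    (1104538800, -0.46369055),
    (1104539100, -0.46712018),
    (1104714000, -0.25210485),
    (1104714300, 0.00321249),
    (1104714600, 0.64116376),
    (1104714900, 0.39599099),
]

close_a = first_close_time(sample, 1104539100, -0.46712018, 1367700000)
close_b = first_close_time(sample, 1104537600, -1.13387667, 1367700000)
```

In this sample the row opened at 1104539100 closes at 1104714600, while the row opened at 1104537600 finds no match and would come back with a NULL CloseTime.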
Here is the procedure itself:
DELIMITER $$
CREATE DEFINER=`root`@`localhost` PROCEDURE `GetTimeList`(t1 varchar(7),t2 varchar(7),t3 varchar(7),inp1 float,inp2 float,inp3 float,inp4 float,inp5 float,inp6 float,inp7 float,inp8 float,inp9 float,inp10 float)
READS SQL DATA
BEGIN
DROP TABLE IF EXISTS MainList;
CREATE TEMPORARY TABLE MainList(
`pTime` int unsigned NOT NULL,
`STD` double NOT NULL,
PRIMARY KEY (`pTime`),
KEY (`STD`) USING BTREE
) ENGINE = MEMORY;
SET @s = CONCAT('INSERT INTO MainList(pTime,STD) SELECT DISTINCT t1.pTime, t1.STD FROM ',t1,' AS t1 JOIN (',t2,' as t2 ,',t3,' as t3 )',
' ON (( t1.Slope >= ', inp1,
' AND t1.Slope <= ', inp2,
' AND t1.STD >= ', inp3,
' AND t1.STD <= ', inp4,
' AND t2.Slope >= ', inp5,
' AND t2.Slope <= ', inp6,
' AND t3.Slope >= ', inp7,
' AND t3.Slope <= ', inp8,
' ) OR ( t1.Slope <= ', 0-inp1,
' AND t1.Slope >= ', 0-inp2,
' AND t1.STD <= ', 0-inp3,
' AND t1.STD >= ', 0-inp4,
' AND t2.Slope <= ', 0-inp5,
' AND t2.Slope >= ', 0-inp6,
' AND t3.Slope <= ', 0-inp7,
' AND t3.Slope >= ', 0-inp8,
' ) ) AND ((t1.Slope < 0 XOR t1.STD < 0) AND t1.pTime = t2.pTime AND t2.pTime = t3.pTime AND t1.pTime >= ', inp9,
' AND t1.pTime <= ', inp10,' ) ORDER BY t1.pTime'
);
PREPARE stmt FROM @s;
EXECUTE stmt;
SET @q= CONCAT('SELECT m.pTime as OpenTime, CASE WHEN m.STD < 0 THEN 1 ELSE -1 END As Type, mu.pTime As CloseTime from MainList m LEFT JOIN ',t1,' mu ON mu.pTime = ( SELECT DISTINCT md.pTime FROM ',t1,' md WHERE md.pTime>m.pTime',' AND md.pTime <= ', inp10,
' AND SIGN (md.STD)!= SIGN (m.STD) AND ABS(md.STD) >= ABS(m.STD) ORDER BY md.pTime LIMIT 1 )');
PREPARE stmt FROM @q;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
DROP TABLE MainList;
END$$
DELIMITER ;
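For ease of testing, I split the above procedure into two separate queries. Here are the queries, each followed by its "EXPLAIN EXTENDED" output (the temporary table was generated beforehand):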
INSERT INTO MainList(pTime,STD)
SELECT
t1.pTime,
t1.STD
FROM
b50d1 AS t1
JOIN(b75d1 AS t2, b100d1 AS t3)ON(
(
t1.Slope >= 2.3169
AND t1.Slope <= 7.0031
AND t1.STD >= - 2.068
AND t1.STD <= - 0.972
AND t2.Slope >= 0.3179
AND t2.Slope <= 5.7221
AND t3.Slope >= 2.6466
AND t3.Slope <= 35.7534
)
OR(
t1.Slope <= - 2.3169
AND t1.Slope >= - 7.0031
AND t1.STD <= 2.068
AND t1.STD >= 0.972
AND t2.Slope <= - 0.3179
AND t2.Slope >= - 5.7221
AND t3.Slope <= - 2.6466
AND t3.Slope >= - 35.7534
)
)
AND(
(t1.Slope < 0 XOR t1.STD < 0)
AND t1.pTime = t2.pTime
AND t2.pTime = t3.pTime
AND t1.pTime >= 1104710000
AND t1.pTime <= 1367700000
)
ORDER BY
t1.pTime;
EXPLAIN EXTENDED:
+----+-------------+-------+--------+---------------+---------+---------+---------------+--------+----------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+--------+---------------+---------+---------+---------------+--------+----------+-----------------------------+
| 1 | SIMPLE | t1 | ALL | PRIMARY,Slope | NULL | NULL | NULL | 439340 | 25.79 | Using where; Using filesort |
| 1 | SIMPLE | t2 | eq_ref | PRIMARY,Slope | PRIMARY | 4 | data.t1.pTime | 1 | 100.00 | Using where |
| 1 | SIMPLE | t3 | eq_ref | PRIMARY | PRIMARY | 4 | data.t1.pTime | 1 | 100.00 | Using where |
+----+-------------+-------+--------+---------------+---------+---------+---------------+--------+----------+-----------------------------+
SELECT
m.pTime AS OpenTime,
CASE WHEN m.STD < 0 THEN 1 ELSE - 1 END AS Type,
mu.pTime AS CloseTime
FROM
MainList m
LEFT JOIN b50d1 mu ON mu.pTime =(
SELECT DISTINCT
md.pTime
FROM
b50d1 md
WHERE
md.pTime > m.pTime
AND md.pTime <= 1367700000
AND SIGN(md.STD)!= SIGN(m.STD)
AND ABS(md.STD)>= ABS(m.STD)
ORDER BY
md.pTime
LIMIT 1
);
EXPLAIN EXTENDED:
+----+--------------------+-------+--------+---------------+---------+---------+------+--------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+-------+--------+---------------+---------+---------+------+--------+----------+-------------+
| 1 | PRIMARY | m | ALL | NULL | NULL | NULL | NULL | 16 | 100.00 | |
| 1 | PRIMARY | mu | eq_ref | PRIMARY | PRIMARY | 4 | func | 1 | 100.00 | Using index |
| 2 | DEPENDENT SUBQUERY | md | range | PRIMARY | PRIMARY | 4 | NULL | 439338 | 100.00 | Using where |
+----+--------------------+-------+--------+---------------+---------+---------+------+--------+----------+-------------+
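The queries work and return the correct results, but they are orders of magnitude slower than I need. I realize that the type: ALL appearing in the EXPLAIN output of both statements suggests that my indexes may not be optimal.
I have only been using MySQL for the past week, and I am starting to feel in over my head. I would really appreciate some help.
I created an SQL file containing CREATE TABLE and INSERT statements, so that anyone interested in helping can create smaller versions of my tables in a "Test" database:
slowtables.SQL
For completeness, here is my my.ini settings file. Perhaps it is a bottleneck?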
[client]
pipe
socket=mysql
[mysql]
default-character-set=latin1
[mysqld]
skip-networking
enable-named-pipe
socket=mysql
basedir="C:/Program Files/MySQL/MySQL Server 5.5/"
datadir="C:/ProgramData/MySQL/MySQL Server 5.5/Data/"
character-set-server=latin1
default-storage-engine=MYISAM
sql-mode="STRICT_TRANS_TABLES,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION"
max_connections=100
query_cache_size=189M
table_cache=256
tmp_table_size=192M
key_buffer_size=594M
read_buffer_size=64K
read_rnd_buffer_size=256K
sort_buffer_size=256K
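Answer 0 (score: 1)
Going by your first query sample, and given that all the tables have (as you mentioned) exactly the same "pTime" values, I would probably change the query to the following. I don't know for sure whether the XOR is faster than a direct multiplication of Slope * STD. If exactly one of the two is negative, the product is negative, because two negatives make a positive (and so do two positives).
However, I have moved the time-range condition up front in the WHERE clause to explicitly qualify the query's time range, even before it tries to complete the joins to tables 2 and 3.
I'm not sure about the multiplication versus the XOR call, but I would bet it becomes a factor over a longer time range. I also thought about applying an ABS() check to the slope first. That being said, I would do it as a UNION, because Slope is also a key on the table and can then be used directly as part of the key rather than through a function evaluated inside ABS(). I can use UNION ALL, because one set of criteria checks for a negative slope and the other for a positive one, so each SELECT will never include rows from the other's result set. In addition, we can drop the XOR check, because the remaining AND clauses already explicitly qualify Slope as having the opposite sign of STD.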
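As a quick sanity check on the XOR-versus-multiplication idea, the following Python sketch compares the two sign tests on a few hand-picked values. The function names are hypothetical and the values are made up for illustration. It suggests the two tests agree whenever both values are non-zero, but can differ when one of them is exactly zero.

```python
def xor_check(slope, std):
    """Equivalent of (Slope < 0 XOR STD < 0)."""
    return (slope < 0) != (std < 0)


def product_check(slope, std):
    """Equivalent of Slope * STD < 0."""
    return slope * std < 0


# Made-up (slope, std) pairs covering all four sign combinations
nonzero_cases = [(5.3, -0.5), (-5.0, 1.0), (5.0, 1.0), (-5.0, -1.0)]
agree_on_nonzero = all(
    xor_check(s, d) == product_check(s, d) for s, d in nonzero_cases
)

# Edge case: a zero slope with a negative STD
zero_xor = xor_check(0.0, -1.0)          # 0 is not < 0, -1 is < 0
zero_product = product_check(0.0, -1.0)  # the product is -0.0, which is not < 0
```

With the ranges used in the question neither Slope nor STD can be zero in a matching row, so under that assumption the distinction should not matter there.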
So, with your other criteria for Slope and STD in mind, consider:
INSERT INTO MainList(pTime,STD)
SELECT STRAIGHT_JOIN
t1.pTime,
t1.STD
FROM
b50d1 AS t1
JOIN b75d1 AS t2
ON t1.pTime = t2.pTime
JOIN b100d1 AS t3
ON t1.pTime = t3.pTime
where
t1.pTime >= 1104710000
AND t1.pTime <= 1367700000
AND t1.Slope >= 2.3169
AND t1.Slope <= 7.0031
AND t1.STD >= - 2.068
AND t1.STD <= - 0.972
AND t2.Slope >= 0.3179
AND t2.Slope <= 5.7221
AND t3.Slope >= 2.6466
AND t3.Slope <= 35.7534
UNION ALL
SELECT STRAIGHT_JOIN
t1.pTime,
t1.STD
FROM
b50d1 AS t1
JOIN b75d1 AS t2
ON t1.pTime = t2.pTime
JOIN b100d1 AS t3
ON t1.pTime = t3.pTime
where
t1.pTime >= 1104710000
AND t1.pTime <= 1367700000
AND t1.Slope >= - 7.0031
AND t1.Slope <= - 2.3169
AND t1.STD >= 0.972
AND t1.STD <= 2.068
AND t2.Slope >= - 5.7221
AND t2.Slope <= - 0.3179
AND t3.Slope >= - 35.7534
AND t3.Slope <= - 2.6466
ORDER BY
pTime;
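A third version would be to pre-query the qualifying entries and then continue with the rest of the joins (the inner query builds the "pq" PreQuery result set):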
INSERT INTO MainList(pTime,STD)
SELECT STRAIGHT_JOIN
pq.pTime,
pq.STD
FROM
( select
t1.pTime,
t1.slope,
t1.std
from
b50d1 t1
where
t1.pTime >= 1104710000
AND t1.pTime <= 1367700000
AND (( t1.slope between 2.3169 and 7.0031
AND t1.std between -2.068 and -.972 )
OR
( t1.slope between -7.0031 and -2.3169
AND t1.std between .972 and 2.068 )) ) pq
JOIN b75d1 AS t2
ON pq.pTime = t2.pTime
JOIN b100d1 AS t3
ON pq.pTime = t3.pTime
where
( pq.slope > 0
AND t2.Slope >= 0.3179
AND t2.Slope <= 5.7221
AND t3.Slope >= 2.6466
AND t3.Slope <= 35.7534
)
OR
( pq.slope < 0
AND t2.Slope >= -5.7221
AND t2.Slope <= -0.3179
AND t3.Slope >= -35.7534
AND t3.Slope <= -2.6466
)
ORDER BY
pq.pTime;
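Answer 1 (score: 1)
I see two possible improvements here, both related to MySQL's optimizer (or its weaknesses). In the second query, the DISTINCT in the subquery is redundant given the LIMIT 1. An ORDER BY ... LIMIT 1 query should be resolved by walking the index until a record matching the other conditions is found. (Do you really need the LEFT JOIN?)
In the first query, MySQL apparently cannot optimize the OR into a UNION. If you do that by hand, however, it may choose a better plan for each half of the UNION query.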
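A small way to convince yourself that the manual OR-to-UNION rewrite returns the same rows is sketched below. It uses Python's built-in sqlite3 module rather than MySQL, so it says nothing about MySQL's query plans or timings; it only checks that the two formulations are logically equivalent on a handful of made-up rows. Only the table and column names and the range constants come from the question; the data is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for name in ("b50d1", "b75d1", "b100d1"):
    conn.execute(
        f"CREATE TABLE {name} (pTime INTEGER PRIMARY KEY, Slope REAL, STD REAL)"
    )

# Hypothetical miniature data set
conn.executemany("INSERT INTO b50d1 VALUES (?, ?, ?)", [
    (1, 5.0, -1.5),   # matches the positive-slope branch
    (2, -5.0, 1.5),   # matches the negative-slope branch
    (3, 5.0, 1.5),    # same signs, matches neither branch
    (4, 1.0, -1.5),   # Slope outside both ranges
])
conn.executemany("INSERT INTO b75d1 VALUES (?, ?, ?)", [
    (1, 2.0, 0.0), (2, -2.0, 0.0), (3, 2.0, 0.0), (4, 2.0, 0.0),
])
conn.executemany("INSERT INTO b100d1 VALUES (?, ?, ?)", [
    (1, 10.0, 0.0), (2, -10.0, 0.0), (3, 10.0, 0.0), (4, 10.0, 0.0),
])

JOINS = ("FROM b50d1 t1 "
         "JOIN b75d1 t2 ON t1.pTime = t2.pTime "
         "JOIN b100d1 t3 ON t1.pTime = t3.pTime")
POS = ("t1.Slope BETWEEN 2.3169 AND 7.0031 AND t1.STD BETWEEN -2.068 AND -0.972 "
       "AND t2.Slope BETWEEN 0.3179 AND 5.7221 AND t3.Slope BETWEEN 2.6466 AND 35.7534")
NEG = ("t1.Slope BETWEEN -7.0031 AND -2.3169 AND t1.STD BETWEEN 0.972 AND 2.068 "
       "AND t2.Slope BETWEEN -5.7221 AND -0.3179 AND t3.Slope BETWEEN -35.7534 AND -2.6466")

# Original shape: one SELECT with the two branches combined by OR
or_sql = f"SELECT t1.pTime, t1.STD {JOINS} WHERE ({POS}) OR ({NEG}) ORDER BY 1"

# Manual rewrite: one SELECT per branch, combined with UNION ALL;
# ORDER BY 1 sorts the combined result by its first column (pTime)
union_sql = (f"SELECT t1.pTime, t1.STD {JOINS} WHERE {POS} "
             f"UNION ALL "
             f"SELECT t1.pTime, t1.STD {JOINS} WHERE {NEG} "
             f"ORDER BY 1")

or_rows = conn.execute(or_sql).fetchall()
union_rows = conn.execute(union_sql).fetchall()
```

UNION ALL is safe here only because the two branches are mutually exclusive (one requires a positive t1.Slope, the other a negative one), so no row can appear twice.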
HTH. I may take another look later.