I have a large MySQL table with about 110,000,000 rows.
The table design is:
CREATE TABLE IF NOT EXISTS `tracksim` (
`tracksimID` int(11) NOT NULL AUTO_INCREMENT,
`trackID1` int(11) NOT NULL,
`trackID2` int(11) NOT NULL,
`sim` double NOT NULL,
PRIMARY KEY (`tracksimID`),
UNIQUE KEY `TrackID1` (`trackID1`,`trackID2`),
KEY `sim` (`sim`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
Now I want to run an ordinary query:
SELECT trackID1, trackID2 FROM `tracksim`
WHERE sim > 0.5 AND
(`trackID1` = 168123 OR `trackID2`= 168123)
ORDER BY sim DESC LIMIT 0,100
The EXPLAIN statement gives me:
+----+-------------+----------+-------+---------------+------+---------+------+----------+----------+-------------+
| id | select_type | table    | type  | possible_keys | key  | key_len | ref  | rows     | filtered | Extra       |
+----+-------------+----------+-------+---------------+------+---------+------+----------+----------+-------------+
|  1 | SIMPLE      | tracksim | range | TrackID1,sim  | sim  | 8       | NULL | 19980582 |   100.00 | Using where |
+----+-------------+----------+-------+---------------+------+---------+------+----------+----------+-------------+
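The query seems to be very slow (about 185 seconds), but I don't know whether that is only because of the number of rows in the table. Do you have any tips on how to speed up the query or the table?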
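Answer 0 (score: 3)
With 110 million records, I can't imagine that many of those rows match the track ID in question. I would have indexes like
(trackID1, sim)
(trackID2, sim)
(tracksimID, sim)
and run a PREQUERY via a UNION, then join against that result: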
select STRAIGHT_JOIN
      TS2.*
   from
      ( select ts.tracksimID
           from tracksim ts
           where ts.trackID1 = 168123
             and ts.sim > 0.5
        UNION
        select ts.tracksimID
           from tracksim ts
           where ts.trackID2 = 168123
             and ts.sim > 0.5
      ) PreQuery
      JOIN tracksim TS2
         on PreQuery.tracksimID = TS2.tracksimID
   order by
      TS2.sim DESC
   LIMIT 0, 100
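For reference, the indexes listed above could be added roughly like this. It is only a sketch: the index names (`idx_track1_sim` and so on) are made up for illustration, and on a MyISAM table of this size the ALTER will take a while, which is why all three are put into a single statement.

```sql
-- Index names are hypothetical; pick whatever fits your naming scheme.
-- One ALTER TABLE with several ADD INDEX clauses avoids rebuilding
-- the 110M-row table once per index.
ALTER TABLE tracksim
  ADD INDEX idx_track1_sim (trackID1, sim),    -- serves: trackID1 = ? AND sim > ?
  ADD INDEX idx_track2_sim (trackID2, sim),    -- serves: trackID2 = ? AND sim > ?
  ADD INDEX idx_id_sim     (tracksimID, sim);  -- third index from the list above
```

Each branch of the UNION can then do an equality lookup on the track ID and a range scan on sim within that one index.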
Answer 1 (score: 2)
I mostly agree with Drap, but the following variant of the query may be more efficient, especially for larger LIMITs:
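SELECT TS2.*
FROM (
    SELECT tracksimID, sim
    FROM tracksim
    WHERE trackID1 = 168123
      AND sim > 0.5
    UNION
    SELECT tracksimID, sim
    FROM tracksim
    WHERE trackID2 = 168123
      AND sim > 0.5
    -- unparenthesized ORDER BY / LIMIT apply to the whole UNION result
    ORDER BY sim DESC
    LIMIT 0, 100
) AS PreQuery
JOIN tracksim TS2 USING (tracksimID)
-- re-sort here: the row order of the join output is not guaranteed
ORDER BY TS2.sim DESC;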
This requires indexes on (trackID1, sim) and (trackID2, sim).
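To check whether those indexes actually get used, you could run EXPLAIN on each half of the UNION separately. This is just a sketch of the check; what the optimizer picks depends on your data and MySQL version.

```sql
-- First branch: look at the `key` column of the output. If the
-- optimizer chose the (trackID1, sim) index, its name appears there.
EXPLAIN
SELECT tracksimID, sim
FROM tracksim
WHERE trackID1 = 168123
  AND sim > 0.5;

-- Second branch: same check for the (trackID2, sim) index.
EXPLAIN
SELECT tracksimID, sim
FROM tracksim
WHERE trackID2 = 168123
  AND sim > 0.5;
```

If `key` still shows `sim` with a row estimate in the millions, as in the EXPLAIN from the question, the new index is not being picked up.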
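Answer 2 (score: 0)
Try filtering your query so that it doesn't return the full table. Alternatively, you could try applying an index to the table on one of the track IDs, for example: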
CREATE INDEX TRACK_INDEX
ON tracksim (trackID1)
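Note that the table in the question already has UNIQUE KEY `TrackID1` (`trackID1`,`trackID2`), and MySQL can use the leftmost column of a composite index by itself, so lookups on trackID1 are already covered. The column with no usable index is trackID2. A sketch of the same idea applied to that column (the index name TRACK2_INDEX is made up here):

```sql
-- List the existing indexes first; the Column_name and Seq_in_index
-- columns show which columns lead each index.
SHOW INDEX FROM tracksim;

-- Hypothetical index name; covers lookups of the form trackID2 = ?
CREATE INDEX TRACK2_INDEX
ON tracksim (trackID2);
```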