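I'm trying to find sourcesites that exist ONLY before a certain timestamp. This query seems very poor for the job. Any idea how to optimize it, or which index might improve it?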
select distinct sourcesite
from contentmeta
where timestamp <= '2011-03-15'
and sourcesite not in (
select distinct sourcesite
from contentmeta
where timestamp>'2011-03-15'
);
There is an index on sourcesite and timestamp, but the query still takes a long time:
mysql> EXPLAIN select distinct sourcesite from contentmeta where timestamp <= '2011-03-15' and sourcesite not in (select distinct sourcesite from contentmeta where timestamp>'2011-03-15');
+----+--------------------+-------------+----------------+---------------+----------+---------+------+--------+-------------------------------------------------+
| id | select_type        | table       | type           | possible_keys | key      | key_len | ref  | rows   | Extra                                           |
+----+--------------------+-------------+----------------+---------------+----------+---------+------+--------+-------------------------------------------------+
|  1 | PRIMARY            | contentmeta | index          | NULL          | sitetime | 14      | NULL | 725697 | Using where; Using index                        |
|  2 | DEPENDENT SUBQUERY | contentmeta | index_subquery | sitetime      | sitetime | 5       | func |     48 | Using index; Using where; Full scan on NULL key |
+----+--------------------+-------------+----------------+---------------+----------+---------+------+--------+-------------------------------------------------+
Answer 0 (score: 3)
This should work:
SELECT DISTINCT c1.sourcesite
FROM contentmeta c1
LEFT JOIN contentmeta c2
ON c2.sourcesite = c1.sourcesite
AND c2.timestamp > '2011-03-15'
WHERE c1.timestamp <= '2011-03-15'
AND c2.sourcesite IS NULL
For best performance, have a multi-column index on contentmeta (sourcesite, timestamp).
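In MySQL, joins generally perform better than subqueries: derived tables cannot use indexes, and the EXPLAIN in the question shows the NOT IN subquery running as a DEPENDENT SUBQUERY, which is re-evaluated for rows of the outer query.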
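As a rough sketch of the index suggested above: the index name `idx_contentmeta_site_time` is made up for illustration. The `sitetime` key in the question's EXPLAIN may already be such an index (its definition is not shown, so that is an assumption), in which case creating another one is unnecessary.

```sql
-- Hypothetical index name; skip this if an equivalent
-- (sourcesite, timestamp) index already exists.
CREATE INDEX idx_contentmeta_site_time
    ON contentmeta (sourcesite, timestamp);

-- Re-check the plan for the join version afterwards.
EXPLAIN
SELECT DISTINCT c1.sourcesite
FROM contentmeta c1
LEFT JOIN contentmeta c2
       ON c2.sourcesite = c1.sourcesite
      AND c2.timestamp > '2011-03-15'
WHERE c1.timestamp <= '2011-03-15'
  AND c2.sourcesite IS NULL;
```

With sourcesite as the leading column, the join can look up each site and range-scan its timestamps inside the index.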
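Answer 1 (score: 3)
The subquery doesn't need DISTINCT, and the WHERE clause on the outer query isn't needed either, since you are already filtering through the NOT IN (a site with no row after the date can only have rows on or before it, assuming timestamp is never NULL).
Try: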
select distinct sourcesite
from contentmeta
where sourcesite not in (
select sourcesite
from contentmeta
where timestamp > '2011-03-15'
);
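Because the condition boils down to "the site's latest row is not after the cutoff", an aggregate form may also be worth benchmarking. This is only a sketch and not part of the original answer; it assumes sourcesite is never NULL, and whether it beats the NOT IN version depends on the MySQL version and the available indexes.

```sql
-- One pass over the table: keep sites whose newest row is on or
-- before the cutoff. MAX() ignores NULL timestamps.
SELECT sourcesite
FROM contentmeta
GROUP BY sourcesite
HAVING MAX(timestamp) <= '2011-03-15';
```

With an index on (sourcesite, timestamp), the grouping can be resolved from the index alone.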
Answer 2 (score: 1)
I've found that "not in" just doesn't optimize well across many databases. Use a left outer join instead:
select distinct cm.sourcesite
from contentmeta cm
left outer join
(
select distinct sourcesite
from contentmeta
where timestamp>'2011-03-15'
) t
on cm.sourcesite = t.sourcesite
where cm.timestamp <= '2011-03-15' and t.sourcesite is null
This assumes that sourcesite is never null.
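If NULLs in sourcesite are a concern, a NOT EXISTS form is one option to consider. Unlike NOT IN, it does not return an empty result just because the subquery produces a NULL. A minimal sketch, not taken from the answers above; the alias `later` is made up for illustration.

```sql
-- Anti-join written with NOT EXISTS: keep a site only if no row
-- for the same site is newer than the cutoff.
SELECT DISTINCT cm.sourcesite
FROM contentmeta cm
WHERE cm.timestamp <= '2011-03-15'
  AND NOT EXISTS (
        SELECT 1
        FROM contentmeta later
        WHERE later.sourcesite = cm.sourcesite
          AND later.timestamp > '2011-03-15'
      );
```

Rows whose sourcesite is NULL never match the correlated condition, so they pass the NOT EXISTS test and show up as a single NULL in the result; add `AND cm.sourcesite IS NOT NULL` if that is unwanted.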