I have a query:
SELECT DISTINCT phoneNum
FROM `Transaction_Register`
WHERE phoneNum NOT IN (SELECT phoneNum FROM `Subscription`)
LIMIT 0 , 1000000
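It takes too long to execute because the Transaction_Register table has millions of records.
Is there an alternative to the query above? I would be grateful for any suggestions.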
Answer 0 (score: 15)
Another approach is to use a LEFT JOIN:
select distinct t.phoneNum
from Transaction_Register t
left join Subscription s
on t.phoneNum = s.phoneNum
where s.phoneNum is null
LIMIT 0 , 1000000;
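For comparison, the same anti-join can also be written with NOT EXISTS. This is only a sketch that reuses the table and column names from the question. It is not part of the original answer, and whether it beats the LEFT JOIN depends on the MySQL version and on whether phoneNum is indexed in both tables.

```sql
-- Sketch only: anti-join written with NOT EXISTS.
-- Keeps every phoneNum in Transaction_Register that has
-- no matching row in Subscription.
SELECT DISTINCT t.phoneNum
FROM Transaction_Register t
WHERE NOT EXISTS (
    SELECT 1
    FROM Subscription s
    WHERE s.phoneNum = t.phoneNum
)
LIMIT 0, 1000000;
```

One behavioral difference worth knowing: if Subscription.phoneNum can contain NULL, the NOT IN form returns no rows at all, while the NOT EXISTS and LEFT JOIN forms still return the unmatched numbers.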
Answer 1 (score: 3)
I doubt that LEFT JOIN really performs better than NOT IN. I ran some tests with the following table structure (please correct me if I'm wrong):
account (id, ....) [42,884 rows, index by id]
play (account_id, playdate, ...) [61,737 rows, index by account_id]
(1) Using LEFT JOIN
SELECT * FROM
account LEFT JOIN play ON account.id = play.account_id
WHERE play.account_id IS NULL
(2) Using NOT IN
SELECT * FROM
account WHERE
account.id NOT IN (SELECT play.account_id FROM play)
Speed test using LIMIT 0, n with increasing n:
LIMIT 0, n ->    100       150       200       250
-------------------------------------------------------------------------
LEFT JOIN        3.213s    4.477s    5.881s    7.472s
NOT IN           2.200s    3.261s    4.320s    5.647s
-------------------------------------------------------------------------
Difference       1.013s    1.216s    1.560s    1.825s
The difference grows as I increase the limit.
EXPLAIN
(1) Using LEFT JOIN
SELECT_TYPE         TABLE    TYPE   ROWS    EXTRA
------------------------------------------------------------------------
SIMPLE              account  ALL    42,884
SIMPLE              play     ALL    61,737  Using where; not exists
(2) Using NOT IN
SELECT_TYPE         TABLE    TYPE   ROWS    EXTRA
------------------------------------------------------------------------
SIMPLE              account  ALL    42,884  Using where
DEPENDENT SUBQUERY  play     INDEX  61,737  Using where; Using index
It looks like the LEFT JOIN is not using the index.
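One way to follow up on that suspicion is to list the indexes on play and re-run EXPLAIN with an index hint. This is a sketch under assumptions: idx_account_id is a hypothetical index name (substitute whatever SHOW INDEX reports), and forcing an index is a diagnostic step, not a guaranteed speedup.

```sql
-- List the indexes that actually exist on play.
SHOW INDEX FROM play;

-- Re-check the plan while asking the optimizer to use the index
-- on play.account_id. idx_account_id is a placeholder name.
EXPLAIN
SELECT account.*
FROM account
LEFT JOIN play FORCE INDEX (idx_account_id)
       ON account.id = play.account_id
WHERE play.account_id IS NULL;
```

If the TYPE column for play changes from ALL to ref, the join is now doing index lookups instead of scanning the whole table for each account row.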
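(1) Using LEFT JOIN
With no index in use, the LEFT JOIN between account and play compares up to 42,884 * 61,737 = 2,647,529,508 row pairs, and only then keeps the account rows whose play.account_id is NULL.
(2) Using NOT IN
A binary search needs about log2(N) steps to determine whether an item exists. Rounding log2(61,737) up to 16, that means 42,884 * 16 = 686,144 steps.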
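The two estimates above can be reproduced directly in MySQL. This is just the arithmetic from the answer, using the row counts quoted there. The column aliases are made up for readability, and the step count is a rough model of index lookups, not a measurement.

```sql
-- Row counts taken from the answer: 42,884 accounts, 61,737 plays.
SELECT
    42884 * 61737             AS left_join_pairs_no_index,  -- 2647529508
    CEIL(LOG2(61737))         AS steps_per_lookup,          -- 16
    42884 * CEIL(LOG2(61737)) AS not_in_steps;              -- 686144
```

Under this model the unindexed join does roughly 3,800 times more work than the indexed lookups.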