Question

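I have a query that works, but it takes forever. It basically returns all rows from table T in database D (with a join condition) that are not in table T2 in database D2 (also with a join condition):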

select f.* from D.frames f
join D.items i on i.id=f.item_id
where f.item_id not in (
    select f2.item_id
    from D2.frames f2
    join D2.items i2 on i2.id=f2.item_id
    where i2.primary_type='xxx'
)

Any help speeding up this query would be greatly appreciated.

Answer 1

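One of your problems may be that the cross-database NOT IN is not being executed as a constant set.

Try SELECTing the rows from D2 into a temporary table (so you have a fixed, in-memory set), then use that temporary table in the NOT IN clause.

Better still, you can LEFT JOIN the temporary table and check for NULL on that field.

Untested: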

CREATE TEMPORARY TABLE TempTable (item_id int);
INSERT INTO
    TempTable
SELECT
    f2.item_id
FROM
    D2.frames f2
JOIN
    D2.items i2 ON i2.id=f2.item_id
WHERE
    i2.primary_type='xxx';

SELECT 
    f.* 
FROM 
    D.frames f
JOIN 
    D.items i ON i.id=f.item_id
LEFT OUTER JOIN
    TempTable t ON t.item_id = f.item_id
WHERE
    t.item_id IS NULL
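
For comparison, the first suggestion (keeping NOT IN but pointing it at the temporary table) could be sketched roughly as follows. It assumes TempTable has already been created and filled as above. The IS NOT NULL filter is an addition that is not in the original answer: NOT IN returns no rows at all if the subquery yields a NULL. In this particular case the inner join on item_id already rules NULLs out, so the guard is purely defensive. Also untested:

```sql
-- Assumes TempTable (item_id int) already exists and is populated as above.
SELECT
    f.*
FROM
    D.frames f
JOIN
    D.items i ON i.id = f.item_id
WHERE
    f.item_id NOT IN (
        SELECT t.item_id
        FROM TempTable t
        WHERE t.item_id IS NOT NULL  -- defensive: a NULL here would make NOT IN match nothing
    );
```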

Answer 2

As far as I know, when you use where aField in (select...), the subquery is evaluated once for every row in the table, so this really is a big performance hit.
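
Whether the subquery really is re-evaluated per row depends on the server version and on what the optimizer decides, so it may be worth checking rather than assuming. A minimal sketch, assuming MySQL (which the syntax in this thread suggests): run the query from the question under EXPLAIN and look at the select_type column. A value of DEPENDENT SUBQUERY indicates the subquery is tied to the outer row and may be re-run for each one; values such as SUBQUERY or MATERIALIZED suggest it is evaluated once.

```sql
-- Prefix the original query with EXPLAIN to see the plan the server picks.
EXPLAIN
select f.* from D.frames f
join D.items i on i.id=f.item_id
where f.item_id not in (
    select f2.item_id
    from D2.frames f2
    join D2.items i2 on i2.id=f2.item_id
    where i2.primary_type='xxx'
);
```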

I suggest you use a left join instead:

select 
    f.*
from 
    (D.frames as f
    inner join D.items as i on f.item_id = i.id)
    left join (
        select f2.item_id
        from D2.frames as f2
        inner join D2.items as i2 on f2.item_id = i2.id
        where i2.primary_type='xxx'
    ) as a on f.item_id = a.item_id
where a.item_id is null;

As an alternative, consider creating a temporary table and doing it in separate steps:

-- Step 1. Create the temporary table with the ids you need
drop temporary table if exists temp_myTable;
create temporary table temp_myTable
    select distinct f2.item_id  -- distinct, so the primary key in Step 2 cannot hit duplicates
    from D2.frames as f2
    inner join D2.items as i2 on f2.item_id = i2.id
    where i2.primary_type='xxx';
-- Step 2. Add the appropriate indexes
alter table temp_myTable
    add primary key (item_id); -- Or add index idx_item_id(item_id)
-- Step 3. Run your query using the newly created temp table
select 
    f.*
from 
    (D.frames as f
    inner join D.items as i on f.item_id = i.id)
    left join temp_myTable as a on f.item_id = a.item_id
where a.item_id is null;
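
The same anti-join can also be written with NOT EXISTS, which some people find easier to read than the left join plus IS NULL check. This is a different formulation from the one above, not something the answer itself proposes, and whether it is any faster depends on the optimizer. A sketch, assuming temp_myTable was created and indexed in Steps 1 and 2:

```sql
-- Assumes temp_myTable exists and has an index (or primary key) on item_id.
select
    f.*
from
    D.frames as f
    inner join D.items as i on f.item_id = i.id
where not exists (
    select 1
    from temp_myTable as a
    where a.item_id = f.item_id  -- correlated lookup, served by the index on item_id
);
```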

Hope this helps.

SQL query: need to speed up a query that diffs against two databases
