Question

I have the following Postgres query, which takes 10 to 50 seconds to execute.

SELECT m.match_id FROM match m
WHERE m.match_id NOT IN(SELECT ml.match_id FROM message_log ml)
AND m.account_id = ?

I created an index on match_id and account_id:

CREATE INDEX match_match_id_account_id_idx ON match USING btree
  (match_id COLLATE pg_catalog."default",
   account_id COLLATE pg_catalog."default");

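But the query still takes a long time. What can I do to speed it up and make it more efficient? My server load goes up to 25 while some of these queries are executing.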

Answer 1

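NOT IN (SELECT ...) can be considerably more expensive because it has to handle NULL separately. It can also be tricky when NULL values are involved. Typically, LEFT JOIN / IS NULL (or one of the other related techniques) is faster: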

Select rows which are not present in other table
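To illustrate why NULL makes NOT IN tricky, here is a minimal sketch using two scratch tables. The names t_match and t_log are hypothetical and are not the asker's real schema. A single NULL in the subquery result is enough to make NOT IN return no rows at all, while NOT EXISTS is unaffected:

```sql
-- Hypothetical scratch tables for illustration only.
CREATE TEMP TABLE t_match (match_id text);
CREATE TEMP TABLE t_log   (match_id text);

INSERT INTO t_match VALUES ('a'), ('b');
INSERT INTO t_log   VALUES ('a'), (NULL);

-- Returns no rows: 'b' NOT IN ('a', NULL) evaluates to NULL, not true.
SELECT match_id
FROM   t_match
WHERE  match_id NOT IN (SELECT match_id FROM t_log);

-- Returns 'b': the NULL row in t_log never matches, so it is simply ignored.
SELECT m.match_id
FROM   t_match m
WHERE  NOT EXISTS (SELECT 1 FROM t_log l WHERE l.match_id = m.match_id);
```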

Applied to your query:

SELECT m.match_id
FROM   match m 
LEFT   JOIN message_log ml USING (match_id)
WHERE  ml.match_id IS NULL
AND    m.account_id = ?;
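One of the other related techniques is NOT EXISTS. A sketch of the same query in that form follows, using only the table and column names from the question. Whether it beats the LEFT JOIN variant depends on the data and the Postgres version, so it is worth comparing both:

```sql
-- Anti-join written with NOT EXISTS; returns the same rows as the
-- LEFT JOIN / IS NULL form and is not affected by NULLs in message_log.
SELECT m.match_id
FROM   match m
WHERE  m.account_id = ?
AND    NOT EXISTS (
   SELECT 1
   FROM   message_log ml
   WHERE  ml.match_id = m.match_id
   );
```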

The best index would be:

CREATE INDEX match_match_id_account_id_idx ON match (account_id, match_id);

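Or just on (account_id), assuming match_id is the PK in both tables. In that case you would also already have the needed index on message_log(match_id). Otherwise, create that one too.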
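Putting the index advice together, a possible sequence could look like the sketch below. Note that the suggested index reuses the name of the existing one, so the old index has to be dropped first or the new one given a different name. The index name message_log_match_id_idx and the account value 'some-account-id' are assumptions made for illustration:

```sql
-- Replace the old (match_id, account_id) index with one led by account_id.
DROP INDEX IF EXISTS match_match_id_account_id_idx;
CREATE INDEX match_match_id_account_id_idx ON match (account_id, match_id);

-- Only needed if message_log.match_id is not already the PK or otherwise indexed.
-- The index name here is hypothetical.
CREATE INDEX IF NOT EXISTS message_log_match_id_idx ON message_log (match_id);

-- Check the resulting plan and timing; 'some-account-id' is a placeholder value.
EXPLAIN (ANALYZE, BUFFERS)
SELECT m.match_id
FROM   match m
LEFT   JOIN message_log ml USING (match_id)
WHERE  ml.match_id IS NULL
AND    m.account_id = 'some-account-id';
```

CREATE INDEX IF NOT EXISTS requires Postgres 9.5 or later; on older versions, drop the IF NOT EXISTS clause and check for the index manually.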

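Also, the COLLATE pg_catalog."default" in your index definition indicates that your ID columns are character types, which is typically inefficient. Integer types are usually the better choice.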
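To confirm the column types before changing anything, something like the following could be used. It is only a sketch: it assumes the tables live in a schema on the current search path, and the second query assumes match_id is a character column, as the COLLATE clause suggests.

```sql
-- Show the declared types of the ID columns in both tables.
SELECT table_name, column_name, data_type
FROM   information_schema.columns
WHERE  table_name IN ('match', 'message_log')
AND    column_name IN ('match_id', 'account_id');

-- Count values that are not purely numeric; a conversion to an
-- integer type is only straightforward if this returns 0.
SELECT count(*) AS non_numeric_ids
FROM   match
WHERE  match_id !~ '^[0-9]+$';
```

If every value turns out to be numeric, the columns could then be converted with ALTER TABLE ... ALTER COLUMN ... TYPE ... USING, taking care to update any referencing columns as well.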

That is my educated guess from the little you have shown so far: there may be more issues.
