Question

I have a table users, with an index on id.

I have another table orders, which belongs to users and has a user_id column that is also indexed.

I have the following query:

SELECT * FROM users
WHERE id IN (
    SELECT user_id FROM orders
    WHERE user_id = '4CF93940-390D-4D70-BE62-61AFC73663BF'
);

Here Postgres correctly uses both indexes (verified by checking the EXPLAIN output).

However, if I do this:

SELECT * FROM users
WHERE id IN (
    SELECT user_id FROM orders
);

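(with the WHERE clause removed from the inner select)

it doesn't use any index at all. I understand why it doesn't use the index on orders.user_id, since I'm selecting all rows, but why doesn't it use the index on users.id?

Here is the EXPLAIN output for the two queries: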

Nested Loop Semi Join  (cost=4.36..22.54 rows=1 width=16)
   ->  Index Only Scan using user_id on users  (cost=0.15..8.17 rows=1 width=16)
         Index Cond: (id = '4cf93940-390d-4d70-be62-61afc73663bf'::uuid)
   ->  Bitmap Heap Scan on orders  (cost=4.21..14.35 rows=7 width=16)
         Recheck Cond: (user_id = '4cf93940-390d-4d70-be62-61afc73663bf'::uuid)
         ->  Bitmap Index Scan on order_user_id  (cost=0.00..4.21 rows=7 width=0)
               Index Cond: (user_id = '4cf93940-390d-4d70-be62-61afc73663bf'::uuid)



Hash Join  (cost=30.88..67.21 rows=885 width=16)
    Hash Cond: (users.id = orders.user_id)
    ->  Seq Scan on users  (cost=0.00..27.70 rows=1770 width=16)
    ->  Hash  (cost=28.38..28.38 rows=200 width=16)
        ->  HashAggregate  (cost=26.38..28.38 rows=200 width=16)
                Group Key: orders.user_id
                ->  Seq Scan on orders  (cost=0.00..23.10 rows=1310 width=16)

Update:

So running this query:

EXPLAIN SELECT * FROM users
WHERE id IN ('4CF93940-390D-4D70-BE62-61AFC73663BF', '4CF93940-390D-4D70-BE62-61AFC73663BF');

correctly uses the index. So basically, why is

WHERE id IN ('4CF93940-390D-4D70-BE62-61AFC73663BF', '4CF93940-390D-4D70-BE62-61AFC73663BF')

different from

WHERE id IN (SELECT user_id FROM orders);

Both seem to be just a set of IDs.

Answer 1

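The planner is cost-based: with tables this small, it estimates that a sequential scan plus a hash join is cheaper than index lookups. With more rows in users relative to the number of rows returned by the sub-select, Postgres will probably start using the index.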
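
One way to check this on the existing tables is to refresh the statistics and then temporarily discourage sequential scans for the session. This is only a diagnostic sketch, assuming the users and orders tables from the question; enable_seqscan is a standard planner setting meant for experiments like this, not for production use:

```sql
-- Refresh planner statistics so row estimates reflect the actual table contents.
ANALYZE users;
ANALYZE orders;

-- Discourage sequential scans for this session only (diagnostic use).
SET enable_seqscan = off;

EXPLAIN
SELECT * FROM users
WHERE id IN (
    SELECT user_id FROM orders
);

-- Restore the default planner behaviour.
RESET enable_seqscan;
```

If the plan switches to index scans but shows a higher estimated cost than the original hash join, the planner was picking the sequential scans deliberately because it considered them cheaper at this table size.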

Also, id IN (list) can be planned very differently from id IN (sub-select) when Postgres is able to treat the construct as a join (as the EXPLAIN output shows).
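
To illustrate the difference, the sub-select form is a semi-join between the two tables, so it can equivalently be written with EXISTS, whereas a literal list is just a filter on users. A sketch using the tables from the question; the aliases u and o are only for illustration, and the plans you get will depend on your data and statistics:

```sql
-- Sub-select form: the planner can treat this as a semi-join between
-- users and orders and pick a join strategy based on estimated cost.
EXPLAIN
SELECT * FROM users u
WHERE EXISTS (
    SELECT 1 FROM orders o
    WHERE o.user_id = u.id
);

-- Literal list form: no second table is involved, so this is a plain
-- filter on users.id that an index on users.id can serve directly.
EXPLAIN
SELECT * FROM users
WHERE id IN ('4CF93940-390D-4D70-BE62-61AFC73663BF');
```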
