我有两个执行相同操作的查询。 1
SELECT *
FROM "Products_product"
WHERE ("Products_product"."id" IN
(SELECT U0."product_id"
FROM "Products_purchase" U0
WHERE (U0."state" = 1
AND U0."user_id" = 5))
AND "Products_product"."state" IN (1,
6,
3)
AND UPPER("Products_product"."title" :: TEXT) LIKE UPPER('%toronto%'))
ORDER BY "Products_product"."title_index" ASC
LIMIT 10;
2
SELECT *
FROM "Products_product"
WHERE ("Products_product"."id" = ANY (ARRAY(
(SELECT U0."product_id"
FROM "Products_purchase" U0
WHERE (U0."state" = 1
AND U0."user_id" = 5))))
AND "Products_product"."state" IN (1,
6,
3)
AND UPPER("Products_product"."title" :: TEXT) LIKE UPPER('%toronto%'))
ORDER BY "Products_product"."title_index" ASC
LIMIT 10;
唯一的区别是第一个使用IN
进行子查询,而第二个使用=ANY(ARRAY())
。但是第二个比第一个快十倍。
经过解释,我得到了两个结果:
1
Limit (cost=5309.92..5309.93 rows=1 width=1906) (actual time=3414.185..3414.190 rows=10 loops=1)
-> Sort (cost=5309.92..5309.93 rows=1 width=1906) (actual time=3414.184..3414.185 rows=10 loops=1)
Sort Key: "Products_product".title
Sort Method: quicksort Memory: 57kB
-> Nested Loop Semi Join (cost=92.66..5309.91 rows=1 width=1906) (actual time=3385.153..3414.099 rows=16 loops=1)
-> Bitmap Heap Scan on "Products_product" (cost=13.85..256.32 rows=61 width=1906) (actual time=3381.327..3384.430 rows=63 loops=1)
Recheck Cond: ((state = ANY ('{1,6,3}'::integer[])) AND (upper((title)::text) ~~ '%TORONTO%'::text))
Rows Removed by Index Recheck: 1
Heap Blocks: exact=64
-> Bitmap Index Scan on "Products_product_state_id_upper_idx" (cost=0.00..13.83 rows=61 width=0) (actual time=3381.001..3381.001 rows=64 loops=1)
Index Cond: ((state = ANY ('{1,6,3}'::integer[])) AND (upper((title)::text) ~~ '%TORONTO%'::text))
-> Bitmap Heap Scan on "Products_purchase" u0 (cost=78.82..82.84 rows=1 width=4) (actual time=0.467..0.467 rows=0 loops=63)
Recheck Cond: ((product_id = "Products_product".id) AND (user_id = 5))
Filter: (state = 1)
Heap Blocks: exact=16
-> BitmapAnd (cost=78.82..78.82 rows=1 width=0) (actual time=0.465..0.465 rows=0 loops=63)
-> Bitmap Index Scan on "Products_purchase_product_id" (cost=0.00..5.06 rows=84 width=0) (actual time=0.265..0.265 rows=30 loops=63)
Index Cond: (product_id = "Products_product".id)
-> Bitmap Index Scan on "Products_purchase_user_id" (cost=0.00..72.57 rows=3752 width=0) (actual time=0.242..0.242 rows=3335 loops=51)
Index Cond: (user_id = 5)
Planning time: 7.540 ms
Execution time: 3414.356 ms
(22 rows)
2
Limit (cost=7378.07..7378.07 rows=1 width=1906) (actual time=116.559..116.562 rows=10 loops=1)
InitPlan 1 (returns $0)
-> Index Scan using "Products_purchase_user_id" on "Products_purchase" u0 (cost=0.43..7329.83 rows=3752 width=4) (actual time=0.021..15.535 rows=3335 loops=1)
Index Cond: (user_id = 5)
Filter: (state = 1)
-> Sort (cost=48.24..48.25 rows=1 width=1906) (actual time=116.558..116.559 rows=10 loops=1)
Sort Key: "Products_product".title
Sort Method: quicksort Memory: 57kB
-> Bitmap Heap Scan on "Products_product" (cost=44.20..48.23 rows=1 width=1906) (actual time=116.202..116.536 rows=16 loops=1)
Recheck Cond: ((id = ANY ($0)) AND (upper((title)::text) ~~ '%TORONTO%'::text))
Filter: (state = ANY ('{1,6,3}'::integer[]))
Rows Removed by Filter: 2
Heap Blocks: exact=18
-> Bitmap Index Scan on "Products_product_id_upper_idx1" (cost=0.00..44.20 rows=1 width=0) (actual time=116.103..116.103 rows=18 loops=1)
Index Cond: ((id = ANY ($0)) AND (upper((title)::text) ~~ '%TORONTO%'::text))
Planning time: 1.054 ms
Execution time: 116.663 ms
(17 rows)
在文档中,IN
或ANY
之间没有实质性区别。但是为什么我得到如此不同的结果。在任何情况下,ANY
都比IN
有优势吗?
更新: 有人指出这个问题可能与IN vs ANY operator in PostgreSQL重复。 它们是相同的问题,但是该问题的答案并不能解决我的问题,因为除了该答案之外,我还有一个更详细的案例。
但是每个的第二个变体并不等同于另一个。的 ANY构造的第二个变体采用数组(必须是实际的 数组类型),而IN的第二个变体则以逗号分隔 值列表。这导致传递值的不同限制 并且在特殊情况下还会导致不同的查询计划:
在我的问题中,任何一个问题都不是这样。我只是传递一个数组作为子查询。我的情况与第一个网址完全相反。我的索引仅在ANY
中使用,而不在IN
中使用。所以基本上,这个答案并不能解决我的问题。
UPDATE2:
我更新了索引:CREATE INDEX ON "Products_product" USING GIST (state, id, upper((title) :: TEXT) gist_trgm_ops);
。我可以确认两个查询的情况相同,这意味着索引存在于此,但第一个不使用它。
UPDATE3:
我只是在代码中删除了ARRAY
。但是结果是一样的。
explain analyze SELECT *
FROM "Products_product"
WHERE ("Products_product"."id" = ANY(
(SELECT U0."product_id"
FROM "Products_purchase" U0
WHERE (U0."state" = 1
AND U0."user_id" = 5)))
AND "Products_product"."state" IN (1,
6,
3)
AND UPPER("Products_product"."title" :: TEXT) LIKE UPPER('%toronto%'))
ORDER BY "Products_product"."title" ASC
LIMIT 10;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=5309.92..5309.93 rows=1 width=1906) (actual time=228.980..228.983 rows=10 loops=1)
-> Sort (cost=5309.92..5309.93 rows=1 width=1906) (actual time=228.979..228.980 rows=10 loops=1)
Sort Key: "Products_product".title
Sort Method: quicksort Memory: 57kB
-> Nested Loop Semi Join (cost=92.66..5309.91 rows=1 width=1906) (actual time=216.392..228.913 rows=16 loops=1)
-> Bitmap Heap Scan on "Products_product" (cost=13.85..256.32 rows=61 width=1906) (actual time=214.332..215.260 rows=63 loops=1)
Recheck Cond: ((state = ANY ('{1,6,3}'::integer[])) AND (upper((title)::text) ~~ '%TORONTO%'::text))
Rows Removed by Index Recheck: 1
Heap Blocks: exact=64
-> Bitmap Index Scan on "Products_product_state_id_upper_idx" (cost=0.00..13.83 rows=61 width=0) (actual time=214.296..214.296 rows=64 loops=1)
Index Cond: ((state = ANY ('{1,6,3}'::integer[])) AND (upper((title)::text) ~~ '%TORONTO%'::text))
-> Bitmap Heap Scan on "Products_purchase" u0 (cost=78.82..82.84 rows=1 width=4) (actual time=0.215..0.215 rows=0 loops=63)
Recheck Cond: ((product_id = "Products_product".id) AND (user_id = 5))
Filter: (state = 1)
Heap Blocks: exact=16
-> BitmapAnd (cost=78.82..78.82 rows=1 width=0) (actual time=0.212..0.212 rows=0 loops=63)
-> Bitmap Index Scan on "Products_purchase_product_id" (cost=0.00..5.06 rows=84 width=0) (actual time=0.017..0.017 rows=30 loops=63)
Index Cond: (product_id = "Products_product".id)
-> Bitmap Index Scan on "Products_purchase_user_id" (cost=0.00..72.57 rows=3752 width=0) (actual time=0.239..0.239 rows=3335 loops=51)
Index Cond: (user_id = 5)
Planning time: 5.083 ms
Execution time: 229.904 ms
(22 rows)
我认为ANY
或ANY(ARRAY())
并非如此,只是IN
和ANY
之间的区别
答案 0 :(得分:0)
不同的计划不是由IN
与= ANY
引起的,而是由第二个查询中子选择周围的附加ARRAY()
引起的。没有这些,计划是相同的。
区别在于,在执行缓慢的情况下,索引扫描会花费很长时间,而在您进行编辑的(完全相同)计划中,相同的扫描会很快:
慢:
-> Bitmap Index Scan on "Products_product_state_id_upper_idx" (cost=0.00..13.83 rows=61 width=0) (actual time=3381.001..3381.001 rows=64 loops=1)
Index Cond: ((state = ANY ('{1,6,3}'::integer[])) AND (upper((title)::text) ~~ '%TORONTO%'::text))
快速:
-> Bitmap Index Scan on "Products_product_state_id_upper_idx" (cost=0.00..13.83 rows=61 width=0) (actual time=214.296..214.296 rows=64 loops=1)
Index Cond: ((state = ANY ('{1,6,3}'::integer[])) AND (upper((title)::text) ~~ '%TORONTO%'::text))
有趣的是,在缓慢的计划中,花了3秒钟才产生索引扫描的第一行。
无论问题出在哪里,它都是暂时的。唯一想到的是killed index tuples:大规模删除留下了许多索引元组,这些索引元组指向死堆元组,这些元组只需要第一次扫描,因为在此之后它们被标记为已死。
您有批量删除吗?