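I'm having trouble with a query for which Postgres picks a bad query plan. Because of the suboptimal plan, the query takes almost 20 seconds.
The problem only occurs for a small number of owner_ids. The distribution of owner_ids is not uniform. The owner_id in the example has 7948 routes. The total number of routes is 2903096.
The database is hosted on Amazon RDS on a server with 34.2 GiB of memory, 4 vCPUs and provisioned IOPS (instance type db.m2.2xlarge). The Postgres version is 9.3.5.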
EXPLAIN ANALYZE SELECT
route.id, route_meta.name
FROM
route
INNER JOIN
route_meta
USING (id)
WHERE
route.owner_id = 128905
ORDER BY
route_meta.name
LIMIT
61
Query plan:
"Limit (cost=0.86..58637.88 rows=61 width=24) (actual time=49.731..18828.052 rows=61 loops=1)"
" -> Nested Loop (cost=0.86..7934263.10 rows=8254 width=24) (actual time=49.728..18827.887 rows=61 loops=1)"
" -> Index Scan using route_meta_i_name on route_meta (cost=0.43..289911.22 rows=2902910 width=24) (actual time=0.016..2825.932 rows=1411126 loops=1)"
" -> Index Scan using route_pkey on route (cost=0.43..2.62 rows=1 width=4) (actual time=0.009..0.009 rows=0 loops=1411126)"
" Index Cond: (id = route_meta.id)"
" Filter: (owner_id = 128905)"
" Rows Removed by Filter: 1"
"Total runtime: 18828.214 ms"
If I increase the limit to 100, a better query plan is used. The query then takes less than 100 ms.
EXPLAIN ANALYZE SELECT
route.id, route_meta.name
FROM
route
INNER JOIN
route_meta
USING (id)
WHERE
route.owner_id = 128905
ORDER BY
route_meta.name
LIMIT
100
Query plan:
"Limit (cost=79964.98..79965.23 rows=100 width=24) (actual time=93.037..93.294 rows=100 loops=1)"
" -> Sort (cost=79964.98..79985.61 rows=8254 width=24) (actual time=93.033..93.120 rows=100 loops=1)"
" Sort Key: route_meta.name"
" Sort Method: top-N heapsort Memory: 31kB"
" -> Nested Loop (cost=0.86..79649.52 rows=8254 width=24) (actual time=0.039..77.955 rows=7948 loops=1)"
" -> Index Scan using route_i_owner_id on route (cost=0.43..22765.84 rows=8408 width=4) (actual time=0.023..13.839 rows=7948 loops=1)"
" Index Cond: (owner_id = 128905)"
" -> Index Scan using route_meta_pkey on route_meta (cost=0.43..6.76 rows=1 width=24) (actual time=0.003..0.004 rows=1 loops=7948)"
" Index Cond: (id = route.id)"
"Total runtime: 93.444 ms"
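The switch between the two plans appears to follow from the cost figures in the EXPLAIN output. This is a rough sketch that assumes the planner scales the cost of the name-ordered nested loop linearly with the fraction of rows the LIMIT needs (startup cost plus run cost times limit / estimated rows). All numbers are copied from the two plans above.

```sql
-- Estimated cost of stopping the name-ordered nested loop early.
-- 0.86 and 7934263.10 are its startup and total cost, 8254 its estimated row count.
SELECT
    round(0.86 + (7934263.10 - 0.86) * 61  / 8254, 2) AS cost_limit_61,   -- about 58637.88, the Limit cost in the first plan
    round(0.86 + (7934263.10 - 0.86) * 100 / 8254, 2) AS cost_limit_100;  -- about 96127.13, above the 79965.23 of the sort-based plan
```

Under that assumption the estimate for LIMIT 61 is cheaper than the sort-based plan, while the estimate for LIMIT 100 is not. The estimate also seems to assume that the matching rows are spread evenly along route_meta_i_name, whereas the first plan actually had to read 1411126 index entries to find 61 matches.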
I have already tried the following:
Increasing the statistics target for owner_id (the owner_id in the example is included in pg_stats)
ALTER TABLE route ALTER COLUMN owner_id SET STATISTICS 1000;
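Reindexing owner_id and name
VACUUM ANALYZE
Increasing work_mem from 1MB to 16MB
Rewriting the query with row_number() OVER (ORDER BY xxx) AS rn in a subquery and ... WHERE rn <= yyy around it. This solves the specific case, but it introduces performance problems for other owner_ids.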
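A minimal sketch of what the row_number() rewrite could look like for the query above. The subquery alias ranked and the outer ORDER BY rn are illustrative assumptions, not the exact statement that was run.

```sql
-- Number the owner's routes by name inside a subquery, then keep the first 61.
-- The planner cannot stop the inner query early, so it has to use the owner_id filter first.
SELECT id, name
FROM (
    SELECT
        route.id,
        route_meta.name,
        row_number() OVER (ORDER BY route_meta.name) AS rn
    FROM route
    INNER JOIN route_meta USING (id)
    WHERE route.owner_id = 128905
) AS ranked
WHERE rn <= 61
ORDER BY rn;
```

Because every route of the owner is joined and numbered before the filter on rn is applied, this form is likely to be slow for owners with a very large number of routes, which would be consistent with the regressions seen for other owner_ids.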
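A similar problem ("Postgres uses wrong index in query plan") was solved with a combined index, but that does not seem possible here because the columns are in different tables.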