I need help deleting a large number of specific rows (about 10k IDs) from a partitioned table.
I ran the following query to analyze the execution plan with just 2 IDs:
explain analyze delete from scheme.users_daily
where user_id in ( -- user_id is not PK, can be duplicated
5791001,
7779001
)
In reality I need to delete about 10k IDs, which I estimate is roughly 1-3% of all the data in the table.
I got the following result:
"Delete on users_daily (cost=0.00..8573700.00 rows=3627 width=6) (actual time=10271.578..10271.578 rows=0 loops=1)"
" -> Seq Scan on users_daily (cost=0.00..0.00 rows=1 width=6) (actual time=0.001..0.001 rows=0 loops=1)"
" Filter: (user_id = ANY ('{5791001,7779001}'::bigint[]))"
" -> Index Scan using users_daily_y2013_unq on users_daily_y2013 (cost=0.57..2351216.55 rows=1556 width=6) (actual time=2792.469..2792.469 rows=0 loops=1)"
" Index Cond: (user_id = ANY ('{5791001,7779001}'::bigint[]))"
" -> Index Scan using users_daily_y2014_unq on users_daily_y2014 (cost=0.56..1745517.74 rows=694 width=6) (actual time=2158.849..2158.849 rows=0 loops=1)"
" Index Cond: (user_id = ANY ('{5791001,7779001}'::bigint[]))"
" -> Index Scan using users_daily_y2015hy1_unq on users_daily_y2015hy1 (cost=0.56..1039034.28 rows=349 width=6) (actual time=1224.623..1224.623 rows=0 loops=1)"
" Index Cond: (user_id = ANY ('{5791001,7779001}'::bigint[]))"
" -> Index Scan using users_daily_y2015hy2_unq on users_daily_y2015hy2 (cost=0.56..1159715.43 rows=375 width=6) (actual time=1380.513..1380.513 rows=0 loops=1)"
" Index Cond: (user_id = ANY ('{5791001,7779001}'::bigint[]))"
" -> Index Scan using users_daily_y2016_unq on users_daily_y2016 (cost=0.57..2278216.00 rows=652 width=6) (actual time=2715.106..2715.106 rows=0 loops=1)"
" Index Cond: (user_id = ANY ('{5791001,7779001}'::bigint[]))"
"Planning time: 0.364 ms"
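With only two IDs the planner already probes every partition's index. For the full ~10k IDs, one pattern I am considering instead of a huge IN list is to stage the IDs in a temporary table and delete via a join (a sketch only; `ids_to_delete` and the VALUES rows are placeholders, while `scheme.users_daily` and `user_id` come from the query above):

```sql
-- Stage the ~10k target IDs once, then delete with a semi-join
-- instead of a 10k-element IN list.
CREATE TEMP TABLE ids_to_delete (user_id bigint PRIMARY KEY);
INSERT INTO ids_to_delete VALUES (5791001), (7779001);  -- ...plus the other ~10k IDs

DELETE FROM scheme.users_daily d
USING ids_to_delete i
WHERE d.user_id = i.user_id;
```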
Then I checked the row counts in the parent table and its partitions:
explain analyze select count(id)
from scheme.users_daily
The response:
"Aggregate (cost=9735664.99..9735665.00 rows=1 width=8) (actual time=1005691.857..1005691.858 rows=1 loops=1)"
" -> Append (cost=0.00..9032279.19 rows=281354320 width=8) (actual time=1.967..797587.379 rows=268084881 loops=1)"
" -> Seq Scan on users_daily (cost=0.00..0.00 rows=1 width=8) (actual time=0.003..0.003 rows=0 loops=1)"
" -> Seq Scan on users_daily_y2013 (cost=0.00..2489084.32 rows=77247432 width=8) (actual time=1.957..134821.451 rows=71962946 loops=1)"
" -> Seq Scan on users_daily_y2014 (cost=0.00..1848731.32 rows=57374432 width=8) (actual time=1.256..72915.919 rows=54835860 loops=1)"
" -> Seq Scan on users_daily_y2015hy1 (cost=0.00..1094813.48 rows=34156048 width=8) (actual time=0.735..40167.938 rows=32796038 loops=1)"
" -> Seq Scan on users_daily_y2015hy2 (cost=0.00..1219911.76 rows=38124076 width=8) (actual time=0.801..43994.913 rows=36669043 loops=1)"
" -> Seq Scan on users_daily_y2016 (cost=0.00..2379738.31 rows=74452331 width=8) (actual time=0.009..109012.004 rows=71820994 loops=1)"
"Planning time: 0.688 ms"
"Execution time: 1005691.977 ms"
How can I work around this performance problem when deleting a large number of rows from this big table?
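One batched, per-partition variant I am also considering (a sketch only; the partition names are taken from the plan above, and the two-element array stands in for a chunk of the real 10k-ID list):

```sql
-- Delete from one partition at a time so each statement drives a
-- single index and each transaction stays small.  The array literal
-- is a placeholder for a chunk (e.g. 1000) of the real IDs.
DELETE FROM scheme.users_daily_y2013 t
USING unnest('{5791001,7779001}'::bigint[]) AS ids(user_id)
WHERE t.user_id = ids.user_id;
-- Repeat for users_daily_y2014, users_daily_y2015hy1,
-- users_daily_y2015hy2 and users_daily_y2016, committing between chunks.
```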