I have this SELECT query:
SELECT count(*) AS y0_
FROM erc.SUBJECTS this_
LEFT OUTER JOIN fias.FIAS_HOUSE factaddres4_ ON this_.FACTADDRESS_REF = factaddres4_.houseId
LEFT OUTER JOIN fias.FIAS_AGREGATE_ADDRESS factaddres5_ ON factaddres4_.houseId = factaddres5_.HOUSEID
LEFT OUTER JOIN erc.REFITEMS okopf_1_ ON this_.OKOPF_REF = okopf_1_.ID
WHERE this_.IS_ACTUAL = 1 AND this_.IS_DELETE <> 1 AND NOT okopf_1_.CODE LIKE '5%' AND NOT okopf_1_.CODE = '0'
It runs for almost 18 seconds.
The SUBJECTS table has 376k rows, FIAS_HOUSE has 21 million rows, and FIAS_AGREGATE_ADDRESS has about 130. EXPLAIN ANALYZE output:
Aggregate (cost=1061561.33..1061561.34 rows=1 width=4) (actual time=17813.460..17813.460 rows=1 loops=1)
  ->  Hash Left Join (cost=106687.31..1060683.61 rows=351088 width=4) (actual time=763.556..17741.820 rows=376196 loops=1)
        Hash Cond: ((factaddres4_.houseid)::text = (factaddres5_.houseid)::text)
        ->  Hash Join (cost=106679.25..1059358.95 rows=351088 width=41) (actual time=760.772..17599.742 rows=376196 loops=1)
              Hash Cond: (this_.okopf_ref = okopf_1_.id)
              ->  Merge Right Join (cost=106599.85..1053887.84 rows=376166 width=45) (actual time=759.211..17411.313 rows=376254 loops=1)
                    Merge Cond: ((factaddres4_.houseid)::text = (this_.factaddress_ref)::text)
                    ->  Index Only Scan using fias_house_pkey on fias_house factaddres4_ (cost=0.56..924229.05 rows=21084566 width=37) (actual time=0.013..8528.487 rows=19627484 loops=1)
                          Heap Fetches: 0
                    ->  Materialize (cost=74125.25..76006.08 rows=376166 width=45) (actual time=759.171..980.286 rows=376254 loops=1)
                          ->  Sort (cost=74125.25..75065.67 rows=376166 width=45) (actual time=759.167..863.495 rows=376254 loops=1)
                                Sort Key: this_.factaddress_ref
                                Sort Method: external sort Disk: 6616kB
                                ->  Seq Scan on subjects this_ (cost=0.00..27715.88 rows=376166 width=45) (actual time=0.790..591.380 rows=376254 loops=1)
                                      Filter: ((is_delete <> 1) AND (is_actual = 1))
                                      Rows Removed by Filter: 138
              ->  Hash (cost=53.85..53.85 rows=2044 width=4) (actual time=1.522..1.522 rows=2051 loops=1)
                    Buckets: 1024 Batches: 1 Memory Usage: 49kB
                    ->  Seq Scan on refitems okopf_1_ (cost=0.00..53.85 rows=2044 width=4) (actual time=0.019..0.930 rows=2051 loops=1)
                          Filter: (((code)::text !~~ '5%'::text) AND ((code)::text <> '0'::text))
                          Rows Removed by Filter: 139
        ->  Hash (cost=6.36..6.36 rows=136 width=37) (actual time=2.761..2.761 rows=136 loops=1)
              Buckets: 1024 Batches: 1 Memory Usage: 8kB
              ->  Seq Scan on fias_agregate_address factaddres5_ (cost=0.00..6.36 rows=136 width=37) (actual time=1.477..2.696 rows=136 loops=1)
Total runtime: 17814.728 ms
Without the join to FIAS_AGREGATE_ADDRESS the query completes much faster. EXPLAIN ANALYZE output:
Aggregate (cost=34066.40..34066.41 rows=1 width=4) (actual time=510.291..510.292 rows=1 loops=1)
  ->  Hash Join (cost=79.40..33188.44 rows=351183 width=4) (actual time=1.573..442.526 rows=376196 loops=1)
        Hash Cond: (this_.okopf_ref = okopf_1_.id)
        ->  Seq Scan on subjects this_ (cost=0.00..27715.88 rows=376267 width=45) (actual time=0.144..248.430 rows=376254 loops=1)
              Filter: ((is_delete <> 1) AND (is_actual = 1))
              Rows Removed by Filter: 138
        ->  Hash (cost=53.85..53.85 rows=2044 width=4) (actual time=1.415..1.415 rows=2051 loops=1)
              Buckets: 1024 Batches: 1 Memory Usage: 49kB
              ->  Seq Scan on refitems okopf_1_ (cost=0.00..53.85 rows=2044 width=4) (actual time=0.007..0.844 rows=2051 loops=1)
                    Filter: (((code)::text !~~ '5%'::text) AND ((code)::text <> '0'::text))
                    Rows Removed by Filter: 139
Total runtime: 510.367 ms
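For reference, a sketch of the faster variant. It assumes that only the FIAS_AGREGATE_ADDRESS join line was removed and the rest of the query was left unchanged; the exact text of that run was not posted.

```sql
-- Assumed reconstruction: the original query minus the
-- FIAS_AGREGATE_ADDRESS join. Everything else is copied as-is.
SELECT count(*) AS y0_
FROM erc.SUBJECTS this_
LEFT OUTER JOIN fias.FIAS_HOUSE factaddres4_ ON this_.FACTADDRESS_REF = factaddres4_.houseId
LEFT OUTER JOIN erc.REFITEMS okopf_1_ ON this_.OKOPF_REF = okopf_1_.ID
WHERE this_.IS_ACTUAL = 1 AND this_.IS_DELETE <> 1
  AND NOT okopf_1_.CODE LIKE '5%' AND NOT okopf_1_.CODE = '0';
```

If that assumption holds, the plan for this variant does not scan fias_house at all. That would be consistent with the planner dropping a LEFT JOIN whose join column is unique and whose columns are not referenced anywhere else in the query.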
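I found this article: https://wiki.postgresql.org/wiki/Slow_Counting but I can't use its suggestions, because the search conditions may vary.
I also can't simply drop the FIAS_AGREGATE_ADDRESS join, because there may be search conditions on that table.
Maybe there is some clever index or other option that I missed because I'm tired and stupid?
UPD: After increasing work_mem from 8MB to 16MB, the EXPLAIN ANALYZE output became: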
Aggregate (cost=1018467.07..1018467.08 rows=1 width=4) (actual time=18615.975..18615.975 rows=1 loops=1)
  ->  Hash Left Join (cost=810328.24..1017589.11 rows=351183 width=4) (actual time=3.609..18543.596 rows=376196 loops=1)
        Hash Cond: ((factaddres4_.houseid)::text = (factaddres5_.houseid)::text)
        ->  Hash Join (cost=810320.18..1016264.10 rows=351183 width=41) (actual time=2.190..18400.383 rows=376196 loops=1)
              Hash Cond: (this_.okopf_ref = okopf_1_.id)
              ->  Merge Left Join (cost=810240.78..1010791.53 rows=376267 width=45) (actual time=0.838..18203.533 rows=376254 loops=1)
                    Merge Cond: ((this_.factaddress_ref)::text = (factaddres4_.houseid)::text)
                    ->  Index Scan using idx_subjects_factaddress_ref_btree on subjects this_ (cost=0.42..32907.70 rows=376267 width=45) (actual time=0.805..701.428 rows=376254 loops=1)
                    ->  Index Only Scan using fias_house_pkey on fias_house factaddres4_ (cost=0.56..924231.15 rows=21084706 width=37) (actual time=0.013..8885.002 rows=19627486 loops=1)
                          Heap Fetches: 0
              ->  Hash (cost=53.85..53.85 rows=2044 width=4) (actual time=1.307..1.307 rows=2051 loops=1)
                    Buckets: 1024 Batches: 1 Memory Usage: 49kB
                    ->  Seq Scan on refitems okopf_1_ (cost=0.00..53.85 rows=2044 width=4) (actual time=0.010..0.802 rows=2051 loops=1)
                          Filter: (((code)::text !~~ '5%'::text) AND ((code)::text <> '0'::text))
                          Rows Removed by Filter: 139
        ->  Hash (cost=6.36..6.36 rows=136 width=37) (actual time=1.396..1.396 rows=136 loops=1)
              Buckets: 1024 Batches: 1 Memory Usage: 8kB
              ->  Seq Scan on fias_agregate_address factaddres5_ (cost=0.00..6.36 rows=136 width=37) (actual time=0.782..1.323 rows=136 loops=1)
Total runtime: 18616.060 ms
The Sort node disappeared, but the query time did not really change.
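A minimal sketch of how the work_mem experiment can be repeated for a single session. It assumes the setting was changed with SET rather than in postgresql.conf, which the post does not specify; 16MB is the value quoted in the update.

```sql
-- Raise work_mem for the current session only (assumption: the
-- original change may have been made in postgresql.conf instead).
SET work_mem = '16MB';

EXPLAIN ANALYZE
SELECT count(*) AS y0_
FROM erc.SUBJECTS this_
LEFT OUTER JOIN fias.FIAS_HOUSE factaddres4_ ON this_.FACTADDRESS_REF = factaddres4_.houseId
LEFT OUTER JOIN fias.FIAS_AGREGATE_ADDRESS factaddres5_ ON factaddres4_.houseId = factaddres5_.HOUSEID
LEFT OUTER JOIN erc.REFITEMS okopf_1_ ON this_.OKOPF_REF = okopf_1_.ID
WHERE this_.IS_ACTUAL = 1 AND this_.IS_DELETE <> 1
  AND NOT okopf_1_.CODE LIKE '5%' AND NOT okopf_1_.CODE = '0';

-- Return to the server default afterwards.
RESET work_mem;
```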
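Every join is backed by a foreign key, and the referenced column is always a primary key. For example, the SUBJECTS table has the FK OKOPF_REF -> REFITEMS.ID, and ID is the primary key column of REFITEMS.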
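To illustrate the relationship just described, a foreign key of this shape could be declared as follows. This is only an illustration: the constraint name fk_subjects_okopf is hypothetical, and the real definitions are in the linked DDL.

```sql
-- Illustration only; the constraint name is made up and the
-- constraint already exists in the real schema.
ALTER TABLE erc.SUBJECTS
    ADD CONSTRAINT fk_subjects_okopf
    FOREIGN KEY (OKOPF_REF) REFERENCES erc.REFITEMS (ID);
```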
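Here is a link to the DDL of these tables (including indexes): https://yadi.sk/d/-OxGh5BDdy4XW.
I posted a trimmed query to make the analysis easier, but there can be different search conditions, for example a substring search across different tables. My worst case is this: for a simple search string (like '123') all joins are present (the search has to run against all tables), yet the count result is still very large. So I can't omit those left joins.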
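A rough sketch of what such a worst-case query might look like. The column names NAME and FULL_ADDRESS are hypothetical placeholders, not taken from the real schema; only okopf_1_.CODE appears in the query above.

```sql
-- Hypothetical worst case: a substring search that touches every
-- joined table, so none of the LEFT JOINs can be dropped.
SELECT count(*) AS y0_
FROM erc.SUBJECTS this_
LEFT OUTER JOIN fias.FIAS_HOUSE factaddres4_ ON this_.FACTADDRESS_REF = factaddres4_.houseId
LEFT OUTER JOIN fias.FIAS_AGREGATE_ADDRESS factaddres5_ ON factaddres4_.houseId = factaddres5_.HOUSEID
LEFT OUTER JOIN erc.REFITEMS okopf_1_ ON this_.OKOPF_REF = okopf_1_.ID
WHERE this_.IS_ACTUAL = 1 AND this_.IS_DELETE <> 1
  AND (this_.NAME LIKE '%123%'                   -- hypothetical column
       OR factaddres5_.FULL_ADDRESS LIKE '%123%' -- hypothetical column
       OR okopf_1_.CODE LIKE '%123%');
```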