Question

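On my development server, a test transaction (a series of updates, etc.) runs in about 2 minutes. On the production server it takes about 25 minutes.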

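The server reads a file and inserts records. It starts out fast but gets slower and slower as the process goes on. Each inserted record is followed by a summary table update, and it is that update that progressively slows down. The summary update does query the tables that are being inserted into.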

The configuration differs only in max_worker_processes (dev 8, prod 16), shared_buffers (dev 128MB, prod 512MB), and wal_buffers (dev 4MB, prod 16MB).

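I tried adjusting some of the configuration settings, and I also dumped the whole database and re-ran initdb in case it had not been upgraded (to 9.6) correctly. Nothing helped.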

I am hoping someone with experience can tell me what to look for.

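Edit: After receiving some comments, I was able to figure out what was happening and came up with a workaround, but I think there has to be a better way. First, here is what is going on: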

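At first, with no data in the tables for the relevant index values, PostgreSQL comes up with the plan below. Note that the tables contain no data with the relevant "businessIdentifier" or "transactionNumber" index values.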

 Aggregate  (cost=16.63..16.64 rows=1 width=4) (actual time=0.031..0.031 rows=1 loops=1)
   ->  Nested Loop  (cost=0.57..16.63 rows=1 width=4) (actual time=0.028..0.028 rows=0 loops=1)
         ->  Index Scan using transactionlinedateindex on "transactionLine" ed  (cost=0.29..8.31 rows=1 width=5) (actual time=0.028..0.028 rows=0 loops=1)
               Index Cond: ((("businessIdentifier")::text = '36'::text) AND ("reconciliationNumber" = 4519))
         ->  Index Scan using transaction_pkey on transaction eh  (cost=0.29..8.31 rows=1 width=9) (never executed)
               Index Cond: ((("businessIdentifier")::text = '36'::text) AND (("transactionNumber")::text = (ed."transactionNumber")::text))
               Filter: ("transactionStatus" = 'posted'::"transactionStatusItemType")
 Planning time: 0.915 ms
 Execution time: 0.100 ms

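Then, as data is inserted, this turns into a very bad plan: 474 ms in this example. It has to be executed thousands of times, depending on what is uploaded, so 474 ms is not good.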

 Aggregate  (cost=16.44..16.45 rows=1 width=4) (actual time=474.222..474.222 rows=1 loops=1)
   ->  Nested Loop  (cost=0.57..16.44 rows=1 width=4) (actual time=474.218..474.218 rows=0 loops=1)
         Join Filter: ((eh."transactionNumber")::text = (ed."transactionNumber")::text)
         ->  Index Scan using transaction_pkey on transaction eh  (cost=0.29..8.11 rows=1 width=9) (actual time=0.023..0.408 rows=507 loops=1)
               Index Cond: (("businessIdentifier")::text = '37'::text)
               Filter: ("transactionStatus" = 'posted'::"transactionStatusItemType")
         ->  Index Scan using transactionlineprovdateindex on "transactionLine" ed  (cost=0.29..8.31 rows=1 width=5) (actual time=0.934..0.934 rows=0 loops=507)
               Index Cond: (("businessIdentifier")::text = '37'::text)
               Filter: ("reconciliationNumber" = 4519)
               Rows Removed by Filter: 2520
 Planning time: 0.848 ms
 Execution time: 474.278 ms

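VACUUM ANALYZE fixes it. However, VACUUM ANALYZE cannot be run until the transaction has been committed. After VACUUM ANALYZE, PostgreSQL uses a different plan and the query is back to about 0.1 ms.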

 Aggregate  (cost=16.63..16.64 rows=1 width=4) (actual time=0.072..0.072 rows=1 loops=1)
   ->  Nested Loop  (cost=0.57..16.63 rows=1 width=4) (actual time=0.069..0.069 rows=0 loops=1)
         ->  Index Scan using transactionlinedateindex on "transactionLine" ed  (cost=0.29..8.31 rows=1 width=5) (actual time=0.067..0.067 rows=0 loops=1)
               Index Cond: ((("businessIdentifier")::text = '37'::text) AND ("reconciliationNumber" = 4519))
         ->  Index Scan using transaction_pkey on transaction eh  (cost=0.29..8.31 rows=1 width=9) (never executed)
               Index Cond: ((("businessIdentifier")::text = '37'::text) AND (("transactionNumber")::text = (ed."transactionNumber")::text))
               Filter: ("transactionStatus" = 'posted'::"transactionStatusItemType")
 Planning time: 1.134 ms
 Execution time: 0.141 ms

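My workaround is to commit after about 100 inserts, run VACUUM ANALYZE, and then continue. The only problem is that if something in the rest of the data fails and is rolled back, those 100 records remain inserted.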

Is there a better way to handle this? Should I just upgrade to PostgreSQL version 10 or 11, and would that help?

Answer 1

"Each inserted record is followed by a summary table update, and it is that update that progressively slows down."

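Here is an idea: change the workflow to (1) import the external data into a table using the COPY interface; (2) index and ANALYZE that data; (3) run a final UPDATE with all the required joins/groupings to do the actual transformation and update the summary table.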

All of this can be done in a single long transaction, if need be.

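Only if the whole thing locks some vital database objects for too long should you consider splitting it into separate transactions/batches (partitioning the data in some generic way, by date/time or by ID).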
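The three steps above could look roughly like the sketch below. It is only an illustration under stated assumptions: the staging table, its columns, the file name, and the summary table (summary_table, line_count) are hypothetical, since the question does not show the real schema. Only the quoted column names are taken from the query plans in the question.

```sql
BEGIN;

-- Hypothetical staging table; the column list is an assumption modelled on
-- the identifiers visible in the question's query plans.
CREATE TEMP TABLE staging_line (
    "businessIdentifier"   text,
    "transactionNumber"    text,
    "reconciliationNumber" integer,
    amount                 numeric
) ON COMMIT DROP;

-- (1) Bulk-load the file. \copy is the psql client-side form of COPY;
--     the file name and the CSV format are placeholders.
\copy staging_line FROM 'upload.csv' WITH (FORMAT csv)

-- (2) Index the loaded rows and refresh the planner statistics.
CREATE INDEX ON staging_line ("businessIdentifier", "reconciliationNumber");
ANALYZE staging_line;

-- (3) One set-based UPDATE instead of one summary update per inserted row.
--     summary_table and line_count are hypothetical names.
UPDATE summary_table s
SET    line_count = s.line_count + agg.n
FROM  (SELECT "businessIdentifier", "reconciliationNumber", count(*) AS n
       FROM   staging_line
       GROUP  BY 1, 2) agg
WHERE  s."businessIdentifier"   = agg."businessIdentifier"
AND    s."reconciliationNumber" = agg."reconciliationNumber";

COMMIT;
```

Because everything runs between BEGIN and COMMIT, a failure at any step rolls back the whole load, which avoids the partially committed batch described in the question.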

"However, VACUUM ANALYZE cannot be run until the transaction has been committed."

To get updated query plan cost estimates, you only need ANALYZE, not VACUUM.
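Unlike VACUUM, ANALYZE is allowed inside a transaction block, so the row-by-row approach from the question could refresh statistics without committing. A minimal sketch, assuming the two tables named in the query plans are the ones being filled; the point at which to analyze (for example, after the first hundred or so inserts) is a guess to be tuned:

```sql
BEGIN;

-- ... first batch of INSERTs into "transactionLine" and "transaction" ...

-- Refresh planner statistics mid-transaction; no COMMIT is needed first.
ANALYZE "transactionLine";
ANALYZE "transaction";

-- ... remaining INSERTs and per-row summary updates ...

COMMIT;  -- or ROLLBACK, which undoes every insert, including the first batch
```

Whether this alone brings back the fast plan would need to be checked with EXPLAIN ANALYZE on the actual data.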

Transaction 20x slower on the production server