Question

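Why do response times vary so much across repeated executions of the same query, from 50% to 200% of the expected response time? They range from 6 to 20 seconds, even when it is the only active query on the database.

Context:

Database on Postgres 9.6 on AWS RDS (with Provisioned IOPS)
Contains one table with five numeric columns, indexed on id, holding 200 million rows

Query: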

SELECT col1, col2 FROM calculations WHERE id > 0 AND id < 100000;

EXPLAIN plan for the query:

Bitmap Heap Scan on calculation  (cost=2419.37..310549.65 rows=99005 width=43)
  Recheck Cond: ((id > 0) AND (id <= 100000))
  ->  Bitmap Index Scan on calculation_pkey  (cost=0.00..2394.62 rows=99005 width=0)
        Index Cond: ((id > 0) AND (id <= 100000))

Is there any reason why a simple query like this would have such unpredictable response times?

Thanks.

Answer 1

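When you see something like this in PostgreSQL EXPLAIN or EXPLAIN ANALYZE output:

(cost=2419.37..310549.65)

...it does not mean that the cost is somewhere between 2419.37 and 310549.65. These are two different measures. The first value is the startup cost and the second is the total cost. Most of the time you only care about the total cost. The startup cost matters when that part of the execution plan feeds something like an EXISTS clause, where only the first row needs to be returned. In that case you care about the startup cost rather than the total, because execution stops almost immediately after startup.

The PostgreSQL documentation on EXPLAIN covers this in detail.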
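
To make the distinction between the two numbers concrete, here is a minimal Python sketch that splits the cost annotation of a plan line into its startup and total parts. The helper name `parse_cost` and the regular expression are illustrative assumptions, not part of PostgreSQL or any client library; the sketch only handles the plain-text `(cost=... rows=... width=...)` form shown in the question.

```python
import re

# Matches the "(cost=<startup>..<total> rows=<n> width=<w>)" part of a plan node.
COST_RE = re.compile(
    r"\(cost=(\d+(?:\.\d+)?)\.\.(\d+(?:\.\d+)?) rows=(\d+) width=(\d+)\)"
)

def parse_cost(plan_line):
    """Return the planner estimates from one EXPLAIN plan line, or None.

    Hypothetical helper for illustration only.
    """
    m = COST_RE.search(plan_line)
    if m is None:
        return None
    return {
        "startup_cost": float(m.group(1)),  # cost before the first row can be returned
        "total_cost": float(m.group(2)),    # cost to return all rows
        "rows": int(m.group(3)),            # estimated number of rows
        "width": int(m.group(4)),           # estimated average row width in bytes
    }

line = "Bitmap Heap Scan on calculation (cost=2419.37..310549.65 rows=99005 width=43)"
est = parse_cost(line)
print(est["startup_cost"], est["total_cost"])
```

Note that these values are planner estimates in arbitrary cost units, not seconds, so on their own they do not explain a difference between 6 and 20 seconds of wall-clock time.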

Answer 2

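When you are the only user of a server, a query can be (and, special cases aside, should be) fairly predictable in its response time. With a cloud server you know nothing about the actual server load, even if your query is the only one running on your database, because the server may well be hosting several databases at once. And since you are asking about response time, accessing a remote server over the network can add variability of its own.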
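
One way to quantify the variability described above is to time the same call several times from the client and look at the spread. The following is a rough Python sketch under stated assumptions: `time_runs`, `summarize`, and `fake_query` are hypothetical names, and `fake_query` is only a stand-in for a function that would execute the real query through whatever database driver you use.

```python
import statistics
import time

def time_runs(run_query, repeats=5):
    """Call run_query() `repeats` times and return wall-clock durations in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        durations.append(time.perf_counter() - start)
    return durations

def summarize(durations):
    """Summarize the spread of a list of durations."""
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
        "max_over_min": max(durations) / min(durations),
    }

def fake_query():
    # Hypothetical stand-in: replace with a function that runs the real query.
    time.sleep(0.01)

print(summarize(time_runs(fake_query, repeats=3)))
```

These client-side timings include network round trips. Comparing them with the execution time that EXPLAIN ANALYZE reports on the server can help indicate whether the variation comes from the network or from the server itself.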

Answer 3

After investigation of the historical load, we have found out that the provisioned IOPS we originally configured had been exhausted during the last set of load tests performed on the environment.

According to Amazon's documentation (http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_Storage.html), past this point Amazon does not guarantee consistent execution times and the SLAs no longer apply.

We have confirmed that replicating the database onto a new instance of AWS RDS with same configuration yields consistent response times when executing the query multiple times.

Why does the response time of the same query vary?