Postgres chooses the wrong execution plan. VACUUM ANALYZE does not seem to change its mind

Time: 2021-02-02 06:56:47

Tags: postgresql

This query

select "key","job","sentDate","scheduledDate","status","recipient","mergeVariables","opens","clicks","smtpEvents", "$$meta.deleted", "$$meta.created", "$$meta.modified", "$$meta.version", "$$meta.deleted", "$$meta.created", "$$meta.modified" from "emails" 
where "emails"."$$meta.deleted" = false 
and "job" in ('6f0b0288-6edd-408f-a0a9-8406fcf4bd88','a36c901c-b2df-427f-8a83-8e7072c1ad55','87127ee7-b13f-4a60-b981-65ea91988bcd','76a3eef0-32b3-4cd2-b6df-2e3360ec484a','893fd688-e789-49b8-9f95-cbaf84520852','3dc85b85-2de4-4e71-b9e0-26dbb122acfd','ae0615b1-2520-45d0-9159-b7794e535bc5','39562342-afa3-4054-82c6-cda103b205d7','6995b876-1781-4e84-b6cb-437ccb45fd4c','adfd15ce-e68a-405c-a18c-fa01daa711ea','901a9e3a-2c0d-476d-97b3-c64b954f0ecc','e6dec9f0-f670-4187-b0a4-fb8a676f0016','a373c541-32c8-4070-8ac0-209683257fe5','5ff6cec9-794f-49f1-9043-cc9120c3b1d4','3d2226c2-7559-41c7-b1e5-688830693ca2','b08bcf0b-fde2-4079-bfc4-aeff9bac48a7','e82eae7f-4e41-410d-9eb9-4b49d7ccde11','d0ae300f-5f8a-4851-9c56-3d87dad3ce2e','fc3c11df-7cd6-4819-888f-8abd2e32367b','0a27151a-3f33-488c-a3e6-4ffcfe9f7020','9b89d3f2-4484-4109-aaeb-382ef480b0e1','9ef54c01-2fe2-4ed9-8d34-6f3fa8108040','21a63e3a-bbdc-43e3-9c73-73c94e8a3ac3','f90ac6c0-d422-4e8c-9dcc-64a8a38f15e4','39d2c420-fbb1-4883-a184-9670c3b5ecb6','63681ea2-e567-4f6f-9b64-32c63d7d7f67','38d7e27b-86ce-4e05-a2eb-a50925e8afab','96dd4ec2-f2d6-44b4-bb0c-97025f6af7e5','4a9cb1ab-6a3e-47d5-9348-51efd918883f','cdd6b061-3a05-47cf-bba3-067ce03d81e7','ef9b60da-b26e-4f60-805a-5b1778a08288','0dcb9ea7-fa78-4c64-bf4b-4d62eb27be8e','104f4306-042c-4df9-bbf2-7c6d7ec5999d','340e95e3-0ff1-435b-babb-029533cd67ff','f5e8c4a1-a0ec-44b7-a84c-4e2bea03dab6','3acb7147-1fc9-4911-93b1-28b0b3316027','7874342e-b3d2-48e1-9ab8-bf3896ff5d69','3c1083e9-9b62-4ae4-9969-55e5813cb566','77afa7ae-436c-4d81-8917-a7bd787447ca','6615613e-5e22-48ce-bca6-1098c086d194','e2e28dab-9e68-41ca-8f98-a95e5c711d7d','ac1140cb-b3f0-4d4c-9236-8242851a4594','53254e80-eaee-4609-b141-b5f3eb50b33d','565fc864-5088-47f4-a5d3-4d1ea2b74e4b','fa4c7805-7208-4a17-9fcb-bfb4cdbbb6b8','7b5a0507-b59c-4de5-b738-7095c561fdd7','4727cc2b-7cb2-4009-bcd5-d20b22390b7d','7ed66544-7d5a-4eda-9cbf-f2a9ab2e8714','327989bd-83d7-4950-81a3-d9569f4b9bd8','aa4ae2b3-b3b0-41ef-8e7b-894fa85c9c70','3fe328f5-5ee6-46bb-8448-dd7e306400fb','29b2c9e0-8302-44bd-939d-3d8a5e242902','7303852d-6b5f-4210-a2df-b755dcc81417','6cb6d3ca-9ba5-4c2d-8cf3-08168cd14933','12fd7ace-5755-4b71-ac9f-cd0fb529873e','dd3a2020-3378-4603-8d70-60047e8189cc','cc9240c4-9d28-4d82-825d-bb88be3ec640','bbf5f70d-d828-44e1-867e-ff29c8945c1b','a11f458e-3176-4ffd-aaf4-71c6aca70a53','24d574d0-57de-41b3-9380-8968c0ab02f5','57de56d7-cdc2-4004-9853-65df9c3e0871','cef46cd7-bac5-4f46-a9a5-b0941553e3ce','9d344984-8164-45ce-a58a-715599f5fd0c','59598d21-90d5-4952-88be-e128aedda324','0f7c1a47-0a65-4d6e-a4e6-7ad5ea5e3afc','78a517bb-9686-4049-9dc2-43bba20916cb','fcf999fe-bbbf-462a-be89-2b6993501c6f','44f9ec05-5408-4778-9dc8-adbd43443af7','25766690-819a-42c1-84af-04149691a852','a923dfec-b368-49bb-af76-7a542bb5b3fb','fee7d0ca-3d74-46a1-8c22-31285af660fb','3b24f58d-1203-45d5-b718-1e2bc51bd811','48d36f2f-aa31-4018-8318-1a0e8c7d20ba','b43093b2-50ad-498f-84b1-b0181ac54d0e','b0310d94-f516-49ea-97f1-2289725a7bdb','9a58a202-91e0-47b5-bc2c-346885ab21ab','0430cc10-7141-4cc3-b4e0-07a4327e9f75','63986387-5157-4f7e-9a22-b667bc82de8f','78339b8b-1351-401d-8c03-1b5674c87f9c','9b97dadb-366a-431f-9af4-192844e9ea86','b5504148-8231-41dd-a316-96b2ec2b4b24','58ab8320-5a21-42f6-b8f2-c487cff59116','a322eb6e-6fb6-4a2a-9dee-293bf9285ae5','2f621d24-7927-4be5-a31b-8ac896cb5c21','865bfbbc-c2d7-466d-8ab3-3b1e30e87b68','f33fb2d8-9a12-4aea-bfa4-55eadbbe63c8','7f929ff1-da47-41e9-abc9-dc26f9158ae2','7a33ccb8-5728-4153-97bc-9bf85b715b20','1736fe52-9a78-442d-8d3d-d14e50791b47','9eb276eb-273d-4e20-8bf3-7fa59ab41cc5','1767d575-c13d-46be-903c-23c667341968','ad6c5b99-4840-4970-b621-0b24992ddcd7','f6b6e795-e53b-4443-8922-3011814651d9','91e17445-4349-4577-b327-90482baa177c','90c42f66-40c4-4121-901f-e27aa94818b8','7aa8f73a-b6f8-4b76-9ad7-c0e0e9ca5158')
order by "$$meta.created" asc,"key" asc limit 500

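takes about 4 minutes because it uses the index that matches the ORDER BY (https://explain.depesz.com/s/FXqx). However, if I change it to LIMIT 5000, it finishes in <100 ms (https://explain.depesz.com/s/OajM).

I have run VACUUM ANALYZE etc., but it does not seem to change the execution path. I am looking for suggestions other than changing the query's limit from 500 to 5000. Ideally, the ordering index would only be used when there is no job filter.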

Edit: I have these indexes:

CREATE INDEX "emails_$$meta.created_key_idx"
ON vsko_mailer_api_prod.emails USING btree
("$$meta.created" ASC NULLS LAST, key ASC NULLS LAST)
TABLESPACE pg_default
WHERE NOT "$$meta.deleted";

And:

CREATE INDEX emails_emailjob
ON vsko_mailer_api_prod.emails USING btree
(job ASC NULLS LAST)
TABLESPACE pg_default;

There are a few others that I don't think are relevant.

I just added this one:

CREATE INDEX emails_emailjob_not_deleted
ON vsko_mailer_api_prod.emails USING hash
(job)
TABLESPACE pg_default
WHERE NOT "$$meta.deleted";

It makes LIMIT 5000 faster, but makes no difference for LIMIT 500.

Edit 2:

https://explain.depesz.com/s/DOyT (LIMIT 5000) https://explain.depesz.com/s/Pi93 (LIMIT 500)

All indexes:

"emails"    "emails_$$meta.created_key_idx" "CREATE INDEX ""emails_$$meta.created_key_idx"" ON vsko_mailer_api_prod.emails USING btree (""$$meta.created"", key) WHERE (NOT ""$$meta.deleted"")"
"emails"    "emails_created"    "CREATE INDEX emails_created ON vsko_mailer_api_prod.emails USING btree (""$$meta.created"")"
"emails"    "emails_deleted"    "CREATE INDEX emails_deleted ON vsko_mailer_api_prod.emails USING btree (""$$meta.deleted"")"
"emails"    "emails_emailjob"   "CREATE INDEX emails_emailjob ON vsko_mailer_api_prod.emails USING btree (job)"
"emails"    "emails_emailjob_not_deleted"   "CREATE INDEX emails_emailjob_not_deleted ON vsko_mailer_api_prod.emails USING hash (job) WHERE (NOT ""$$meta.deleted"")"
"emails"    "emails_lowered_job"    "CREATE INDEX emails_lowered_job ON vsko_mailer_api_prod.emails USING btree (lower((job)::text))"
"emails"    "emails_modified"   "CREATE INDEX emails_modified ON vsko_mailer_api_prod.emails USING btree (""$$meta.modified"")"
"emails"    "emails_ordered_created"    "CREATE INDEX emails_ordered_created ON vsko_mailer_api_prod.emails USING btree (""$$meta.created"") WHERE (""$$meta.deleted"" = false)"
"emails"    "emails_ordered_created_and_keys"   "CREATE INDEX emails_ordered_created_and_keys ON vsko_mailer_api_prod.emails USING btree (""$$meta.created"", key)"
"emails"    "emails_ordered_sentdate"   "CREATE INDEX emails_ordered_sentdate ON vsko_mailer_api_prod.emails USING btree (""sentDate"" DESC)"
"emails"    "emails_pkey"   "CREATE UNIQUE INDEX emails_pkey ON vsko_mailer_api_prod.emails USING btree (key)"
"emails"    "emails_status" "CREATE INDEX emails_status ON vsko_mailer_api_prod.emails USING btree (status)"
"emails"    "lowered_recipient_emailaddress_emails" "CREATE INDEX lowered_recipient_emailaddress_emails ON vsko_mailer_api_prod.emails USING btree (lower(((recipient)::json ->> 'emailAddress'::text)))"
"emails"    "lowered_recipient_person_href" "CREATE INDEX lowered_recipient_person_href ON vsko_mailer_api_prod.emails USING btree (lower(((((recipient)::json ->> 'person'::text))::json ->> 'href'::text)))"

2 answers:

Answer 0: (score: 2)

Your first attempt should be to improve the estimates so that PostgreSQL chooses the right plan. This can be done with better statistics:

ALTER TABLE emails ALTER job SET STATISTICS 1000;
ANALYZE emails;

You can experiment with values up to 10000.
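To see whether a higher statistics target changes anything, it can help to look at what the planner has recorded for the column before and after ANALYZE. The following is only a sketch: the schema name is taken from the index definitions in the question, and the interpretation in the comments is a rule of thumb rather than a guarantee.

```sql
-- Inspect the statistics collected for the "job" column.
-- pg_stats is a standard system view; the schema name below is the one
-- that appears in the CREATE INDEX statements in the question.
SELECT n_distinct,          -- estimated number of distinct job values
       most_common_vals,    -- job values tracked individually
       most_common_freqs    -- their estimated share of the table
FROM pg_stats
WHERE schemaname = 'vsko_mailer_api_prod'
  AND tablename  = 'emails'
  AND attname    = 'job';
```

If n_distinct looks far from the real number of jobs, the row estimate for the IN list is probably off as well, which is the kind of misestimate a larger statistics target may correct.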

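If that fails, you can change the ORDER BY clause so that no index supports it. PostgreSQL can then no longer use the ordering index and will use the index on job instead: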

...
ORDER BY "$$meta.created" + INTERVAL '0 days', key

Here I assume "$$meta.created" is a timestamp; if it is not, add some other no-op expression instead.
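Put together with the query from the question, the rewrite might look like the sketch below. It is deliberately shortened: only two of the job ids and three of the columns are listed, and it assumes "$$meta.created" is a timestamp type.

```sql
-- Shortened version of the query from the question (two job ids,
-- three columns) with the ORDER BY turned into an expression.
SELECT "key", "job", "status"
FROM "emails"
WHERE "emails"."$$meta.deleted" = false
  AND "job" IN ('6f0b0288-6edd-408f-a0a9-8406fcf4bd88',
                'a36c901c-b2df-427f-8a83-8e7072c1ad55')
-- Adding a zero interval leaves every value unchanged, but the sort key is
-- now an expression, so an index on the plain column cannot supply the order.
ORDER BY "$$meta.created" + INTERVAL '0 days' ASC, "key" ASC
LIMIT 500;
```

For an integer sort column, adding 0 would play the same role. The result set is the same as before; only the set of plans available to the optimizer changes.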

Answer 1: (score: 1)

Create test data:

BEGIN;
CREATE TABLE foo( id INTEGER NOT NULL, 
  key INTEGER NOT NULL, job INTEGER NOT NULL, 
  created INTEGER NOT NULL, dummy INTEGER NOT NULL,
  deleted BOOL NOT NULL );
INSERT INTO foo SELECT n, random()*10000, random()*10000, n+random()*10000, 1,
    random()>0.1 FROM generate_series(1,1000000) n;
ALTER TABLE foo ADD PRIMARY KEY (id);
COMMIT;
VACUUM ANALYZE foo;

CREATE INDEX foo_job_not_deleted ON foo(job) WHERE NOT deleted;
CREATE INDEX foo_created ON foo(created,key) WHERE NOT deleted;
CREATE INDEX foo_created1 ON foo(created);

With these, I get your bad plan too.

One solution is to use a LATERAL JOIN to force a nested loop:

SELECT foo2.* FROM (VALUES (6479),(672),(6264),(5911),(6161),(7704),(2609),(4095),(271),(2363),(7299),(7330),(1990),(6523),(9261),(9490),(5013),(1131),(585),(8881),(8379),(1543),(5911),(7243),(3608),(9199),(8950),(1485),(7159),(2126),(2876),(779),(6890),(4315),(2253),(3909),(7355),(2876),(9981),(6653),(8407),(1772),(1348),(5689),(2857),(3535),(7607),(6275),(7596),(1885),(6827),(4180),(4638),(1876),(9403),(4195),(2548),(2827),(7972),(5571),(8426),(7761),(6400),(9175),(7486),(589),(3538),(8495),(2864),(5349),(4834),(1357),(6778),(6232),(7457),(6740),(5011),(946),(2918),(9981),(6903),(5565),(9396),(4482),(9796),(5925),(4971),(1304),(71),(7926),(2173),(3439),(7508),(7763),(4890),(5660),(8436),(8828),(5524),(6418)) jobs 
  JOIN LATERAL (SELECT * FROM foo WHERE foo.job=jobs.column1 AND NOT foo.deleted) foo2 ON (true)
 ORDER BY created,key LIMIT 500;

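The LATERAL JOIN subquery is evaluated independently for each job in the VALUES clause. Since each evaluation matches only a single job value, which is a small fraction of the table, this forces the optimizer to use the index on job for the subquery.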
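One difference from the IN form is worth keeping in mind: a join against a VALUES list returns matching rows once per occurrence of a value, so a repeated job id (5911, 2876 and 9981 each appear twice in the list above) yields duplicate rows. A minimal sketch of one way to guard against that, using a short list with a deliberate duplicate and the foo test table from this answer:

```sql
SELECT foo2.*
FROM (
    -- DISTINCT collapses repeated job ids; 5911 is listed twice on purpose.
    -- column1 is the default name PostgreSQL gives a VALUES column.
    SELECT DISTINCT column1 AS job
    FROM (VALUES (6479), (672), (5911), (5911)) AS v
) jobs
JOIN LATERAL (
    -- Evaluated once per distinct job id
    SELECT * FROM foo
    WHERE foo.job = jobs.job AND NOT foo.deleted
) foo2 ON true
ORDER BY created, key
LIMIT 500;
```

If the application already guarantees unique job ids, the DISTINCT step is unnecessary.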

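If the table contains many columns, especially large TEXT columns, and the subquery returns many rows that are fetched only to be discarded by the LIMIT clause, it can be beneficial to fetch only the primary key in the subquery and then, after the LIMIT, join back to the main table to get all the columns you want, only for the rows that actually appear in the final result.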
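A sketch of that two-step shape against the foo test table follows. The job list is shortened for readability, and the alias picked is just an illustrative name. Note that the inner query has to carry the sort columns along with the primary key, since the LIMIT is applied on that ordering.

```sql
SELECT foo.*
FROM (
    -- Step 1: collect only the primary key and the sort columns,
    -- then apply ORDER BY and LIMIT to these narrow rows.
    SELECT foo2.id, foo2.created, foo2.key
    FROM (VALUES (6479), (672), (6264)) AS jobs(job)
    JOIN LATERAL (
        SELECT id, created, key
        FROM foo
        WHERE foo.job = jobs.job AND NOT foo.deleted
    ) foo2 ON true
    ORDER BY foo2.created, foo2.key
    LIMIT 500
) picked
-- Step 2: fetch the full rows for the surviving ids only.
JOIN foo USING (id)
-- The join does not preserve the inner order, so sort again.
ORDER BY picked.created, picked.key;
```

Whether this pays off depends on how wide the rows are and how many of them the LIMIT throws away; comparing both variants with EXPLAIN (ANALYZE, BUFFERS) is the safest way to decide.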