Changing ORDER BY from id to another indexed column (with a low LIMIT) has a huge cost

Asked: 2014-07-21 18:55:11

Tags: sql postgresql postgresql-9.1

I have a query on a table with 500,000 rows.

Basically:

WHERE s3_.id = 287
ORDER BY m0_.id DESC
LIMIT 25

=> query run time = 20 ms

WHERE s3_.id = 287
ORDER BY m0_.created_at DESC
LIMIT 25

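=> query run time = 15000 ms or more

There is an index on created_at.

The query plans are completely different.

Unfortunately, I am not a query plan guru. I would like to reproduce the fast query plan while ordering by created_at.

Is that possible, and how would I do it?

Query plan - slow query (ordered by m0_.created_at): http://explain.depesz.com/s/KBl

Query plan - fast query (ordered by m0_.id): http://explain.depesz.com/s/2pYZ

Full query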

SELECT m0_.id AS id0, m0_.content AS content1, m0_.created_at AS created_at2,
c1_.id AS id3, l2_.id AS id4, l2_.reference AS reference5,
s3_.id AS id6, s3_.name AS name7, s3_.code AS code8,
u4_.email AS email9, u4_.id AS id10, u4_.firstname AS firstname11, u4_.lastname AS lastname12,
u5_.email AS email13, u5_.id AS id14, u5_.firstname AS firstname15, u5_.lastname AS lastname16,
g6_.id AS id17, g6_.firstname AS firstname18, g6_.lastname AS lastname19, g6_.email AS email20,
m0_.conversation_id AS conversation_id21, m0_.author_user_id AS author_user_id22, m0_.author_guest_id AS author_guest_id23,
c1_.author_user_id AS author_user_id24, c1_.author_guest_id AS author_guest_id25, c1_.listing_id AS listing_id26,
l2_.poster_id AS poster_id27, l2_.site_id AS site_id28, l2_.building_id AS building_id29, l2_.type_id AS type_id30, l2_.neighborhood_id AS neighborhood_id31, l2_.facility_bathroom_id AS facility_bathroom_id32, l2_.facility_kitchen_id AS facility_kitchen_id33, l2_.facility_heating_id AS facility_heating_id34, l2_.facility_internet_id AS facility_internet_id35, l2_.facility_condition_id AS facility_condition_id36, l2_.original_translation_id AS original_translation_id37, 
u4_.site_id AS site_id38, u4_.address_id AS address_id39, u4_.billing_address_id AS billing_address_id40,
u5_.site_id AS site_id41, u5_.address_id AS address_id42, u5_.billing_address_id AS billing_address_id43,
g6_.site_id AS site_id44
FROM message m0_
INNER JOIN conversation c1_ ON m0_.conversation_id = c1_.id
INNER JOIN listing l2_ ON c1_.listing_id = l2_.id
INNER JOIN Site s3_ ON l2_.site_id = s3_.id
INNER JOIN user_ u4_ ON l2_.poster_id = u4_.id
LEFT JOIN user_ u5_ ON m0_.author_user_id = u5_.id
LEFT JOIN guest_data g6_ ON m0_.author_guest_id = g6_.id
WHERE s3_.id = 287
ORDER BY m0_.created_at DESC
LIMIT 25 OFFSET 0

3 answers:

Answer 0 (score: 2)

It turned out to be an index problem. The NULLS behavior of the query was not consistent with the index.

CREATE INDEX message_created_at_idx on message (created_at DESC NULLS LAST);

... ORDER BY message.created_at DESC; -- defaults to NULLS FIRST when DESC

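Solution

If you specify NULLS in the index or in the query, make sure the two are consistent with each other.

That is, an index on ASC NULLS LAST is consistent with a query using ASC NULLS LAST or DESC NULLS FIRST (the latter via a backward index scan).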

NULLS LAST

CREATE INDEX message_created_at_idx on message (created_at DESC NULLS LAST);

... ORDER BY message.created_at DESC NULLS LAST;

NULLS FIRST

CREATE INDEX message_created_at_idx on message (created_at DESC); -- defaults to NULLS FIRST when DESC

... ORDER BY message.created_at DESC; -- defaults to NULLS FIRST when DESC

NOT NULL

If your column is defined NOT NULL, don't bother with NULLS.

CREATE INDEX message_created_at_idx on message (created_at DESC);

... ORDER BY message.created_at DESC;
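
One way to check this yourself is to compare EXPLAIN output for sort orders that match the index and ones that do not. The sketch below uses a hypothetical table message_demo (not part of the original schema) filled with generated rows. The actual plans depend on your PostgreSQL version and statistics, so read the comments as what one would typically expect, not as guaranteed output.

```sql
-- Hypothetical demo table, for illustration only.
CREATE TABLE message_demo (
    id         serial PRIMARY KEY,
    created_at timestamp          -- nullable on purpose
);

-- 100,000 rows, one per minute going back in time.
INSERT INTO message_demo (created_at)
SELECT now() - g * interval '1 minute'
FROM   generate_series(1, 100000) AS g;

CREATE INDEX message_demo_created_at_idx
    ON message_demo (created_at DESC NULLS LAST);

ANALYZE message_demo;

-- Same ordering as the index: a forward index scan can supply the order.
EXPLAIN SELECT * FROM message_demo
ORDER BY created_at DESC NULLS LAST LIMIT 25;

-- Exact reverse of the index ordering: a backward index scan can supply it.
EXPLAIN SELECT * FROM message_demo
ORDER BY created_at ASC NULLS FIRST LIMIT 25;

-- DESC defaults to NULLS FIRST, which the index above cannot supply;
-- expect an explicit sort step here instead.
EXPLAIN SELECT * FROM message_demo
ORDER BY created_at DESC LIMIT 25;
```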

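Answer 1 (score: 1)

Fix your query

Your WHERE condition is on a table that is joined via LEFT JOIN nodes. The WHERE condition forces those joins to behave like [INNER] JOIN. This is pointless and may confuse the query planner, especially in a query with many tables and therefore many possible query plans. By setting this right you drastically reduce the number of possible query plans, which makes it easier for Postgres to find a good one.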
More details in the answer to the additionally spawned question.

SELECT m0_.id AS id0, ...
FROM   site            s3_
JOIN   listing         l2_ ON l2_.site_id = s3_.id
JOIN   conversation    c1_ ON c1_.listing_id = l2_.id
JOIN   message         m0_ ON m0_.conversation_id = c1_.id

LEFT   JOIN user_      u4_ ON u4_.id = l2_.poster_id
LEFT   JOIN user_      u5_ ON u5_.id = m0_.author_user_id
LEFT   JOIN guest_data g6_ ON g6_.id = m0_.author_guest_id
WHERE  s3_.id = '287'  -- ??
ORDER  BY m0_.created_at DESC
LIMIT  25

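Why s3_.id = '287'?

287 looks like it should be of type integer, and you would normally write a numeric constant without quotes: 287. What is the actual data type (and why)? Either way, this is only a minor issue.

Reading the query plans

@FuzzyTree already hinted (quite accurately) that sorting on a different column than the one used in the WHERE clause complicates things. But that is not the elephant in the room here.

The combination with LIMIT 25 makes the difference huge. Both query plans show a reduction from rows=124616 to rows=25 in their last step, which is huge.

Both query plans also show: Seq Scan on site s3_ ... rows=1. So if you ORDER BY s3_.id in your fast variant, you are not actually ordering anything, while the other query has to find the top 25 rows out of 124616 candidates. Hardly a fair comparison.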
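
To see where the time actually goes, it can help to run both variants with EXPLAIN (ANALYZE, BUFFERS) and compare estimated against actual row counts at each node. The following is a reduced sketch of the query from the question: the select list is shortened and the outer joins are left out, while table and alias names are taken from the question. Keep in mind that EXPLAIN ANALYZE really executes the query.

```sql
-- Reduced form of the question's query; shortened select list is an assumption
-- made for readability, not the original statement.
EXPLAIN (ANALYZE, BUFFERS)
SELECT m0_.id, m0_.created_at
FROM   site         s3_
JOIN   listing      l2_ ON l2_.site_id = s3_.id
JOIN   conversation c1_ ON c1_.listing_id = l2_.id
JOIN   message      m0_ ON m0_.conversation_id = c1_.id
WHERE  s3_.id = 287
ORDER  BY m0_.created_at DESC   -- replace with m0_.id to get the other variant
LIMIT  25;
```

In the output, the node just below Limit shows how many candidate rows had to be produced before the first 25 could be returned.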

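Solution

After clarification, the problem seems clearer. You select a large number of rows by one criterion, but order by another. No conventional index design can cover this, even if both columns resided in the same table (which they don't).

I think we found a (non-trivial) solution for this class of problem under a related question on dba.SE.

Of course, all the usual advice for query optimization and general performance optimization applies.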

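Answer 2 (score: 0)

In your first query, both WHERE and ORDER BY are on id, so it can make use of the same index, whereas your second query uses different columns for WHERE and ORDER BY.

Try adding a composite index so that the same index can be used for both WHERE and ORDER BY.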
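
As a sketch of what such a composite index could look like: the example below assumes, hypothetically, that the filter column and the sort column live in the same table (a made-up table message_flat with its own site_id column). In the question's schema site_id is a column of listing, not of message, so this would only apply after denormalizing, for example by copying site_id into message.

```sql
-- Hypothetical table where filter and sort columns sit side by side.
CREATE TABLE message_flat (
    id         serial PRIMARY KEY,
    site_id    integer   NOT NULL,
    created_at timestamp NOT NULL
);

-- Equality column first, sort column second.
CREATE INDEX message_flat_site_created_idx
    ON message_flat (site_id, created_at DESC);

-- With the index above, the newest rows for one site can be read
-- in index order without sorting all matching rows first.
SELECT id, created_at
FROM   message_flat
WHERE  site_id = 287
ORDER  BY created_at DESC
LIMIT  25;
```

Putting the equality column first means all entries for one site_id are adjacent in the index and already ordered by created_at, which is what lets a low LIMIT stop early.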