My query looks something like this (note: the actual query is generated by Hibernate and is somewhat more complicated):
select * from outage_revisions orev
join outages o
on orev.outage=o.id
where o.observed_end is null
and orev.observation_date =
(select max(observation_date)
from outage_revisions orev2
where orev2.observation_date <= '2011-11-21 00:00:00'
and orev2.outage = orev.outage);
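This query runs very slowly (about 15 minutes). However, if I take out the part of the where
clause that contains the subquery, it returns almost immediately (about 83 ms) with only about 14 rows.
In addition, the subquery on its own is very fast (about 31 ms):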
select max(observation_date) from outage_revisions orev2
where orev2.observation_date <= '2011-11-21 00:00:00'
and orev2.outage = 1
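My question is: if the full query without the subquery filter returns only 14 rows, why does adding the subquery make the query so much slower? Shouldn't the subquery add at most about 31 * 14 ms?
Here is the plan for the full query: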
Nested Loop  (cost=0.00..71078813.16 rows=1 width=115)
  ->  Seq Scan on outagerevisions orev  (cost=0.00..71077624.67 rows=284 width=79)
        Filter: (observationdate = (SubPlan 2))
        SubPlan 2
          ->  Result  (cost=1250.56..1250.57 rows=1 width=0)
                InitPlan 1 (returns $1)
                  ->  Limit  (cost=0.00..1250.56 rows=1 width=8)
                        ->  Index Scan Backward using idx_observationdate on outagerevisions orev2  (cost=0.00..2501.12 rows=2 width=8)
                              Index Cond: (observationdate <= '2011-11-21 00:00:00'::timestamp without time zone)
                              Filter: ((observationdate IS NOT NULL) AND (outage = $0))
  ->  Index Scan using outages_pkey on outages o  (cost=0.00..4.17 rows=1 width=36)
        Index Cond: (o.id = orev.outage)
        Filter: (o.observedend IS NULL)
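Answer 0 (score: 3)
My guess is that PostgreSQL is simply making a poor choice in how it executes the query. Although it seems like it should narrow the result down to the 9 rows before executing the correlated subquery, it probably isn't doing that, so the subquery has to run 60,000 times. While it is doing that, it also has to keep track of which rows will carry on to the next step, and so on.
Here are a couple of other ways you could try writing it: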
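One way to check this guess rather than rely on it is to look at actual execution counts. The sketch below simply wraps the query from the question in EXPLAIN ANALYZE. It assumes the table and column names shown in the question, and the real Hibernate-generated statement would need to be substituted.

```sql
-- EXPLAIN ANALYZE really executes the statement, so expect it to take
-- about as long as the slow query itself.
EXPLAIN ANALYZE
SELECT *
FROM outage_revisions orev
JOIN outages o
  ON orev.outage = o.id
WHERE o.observed_end IS NULL
  AND orev.observation_date =
      (SELECT max(observation_date)
       FROM outage_revisions orev2
       WHERE orev2.observation_date <= '2011-11-21 00:00:00'
         AND orev2.outage = orev.outage);
```

In the output, the `loops=` value on the nodes under the SubPlan shows how many times the correlated subquery was actually run. If that number is close to the row count of outage_revisions rather than to the handful of rows expected after the join, the subquery is being evaluated before the filter on outages.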
SELECT
<column list>
FROM
Outage_Revisions OREV
JOIN Outages O ON
OREV.outage = O.id
LEFT OUTER JOIN Outage_Revisions OREV2 ON
OREV2.outage = OREV.outage AND
OREV2.observation_date <= '2011-11-21 00:00:00' AND
OREV2.observation_date > OREV.observation_date
WHERE
O.observed_end IS NULL AND
OREV2.outage IS NULL
Or (assuming PostgreSQL and Hibernate support joining to a subquery):
SELECT
<column list>
FROM
Outage_Revisions OREV
JOIN Outages O ON
OREV.outage = O.id
JOIN (SELECT OREV2.outage, MAX(OREV2.observation_date) AS max_observation_date
FROM Outage_Revisions OREV2
WHERE OREV2.observation_date <= '2011-11-21 00:00:00'
GROUP BY OREV2.outage) SQ ON
SQ.outage = OREV.outage AND
SQ.max_observation_date = OREV.observation_date
WHERE
O.observed_end IS NULL
You can experiment with the order of the joins in that last query.
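Two further things that might be worth trying, offered as a sketch rather than a confirmed fix. The plan in the question shows the subquery walking idx_observationdate backward and filtering on outage, at an estimated cost of about 1250 per execution, so an index that leads with outage may make each lookup much cheaper. The index name below is hypothetical. The second statement uses DISTINCT ON, which is PostgreSQL-specific and would most likely need a native query in Hibernate.

```sql
-- Hypothetical index name; lets the per-outage max(observation_date)
-- lookup go straight to the rows for one outage.
CREATE INDEX idx_outage_revisions_outage_obsdate
    ON outage_revisions (outage, observation_date);

-- PostgreSQL-only alternative: keep the latest revision per outage
-- on or before the cutoff date.
SELECT DISTINCT ON (orev.outage)
       orev.*
FROM outage_revisions orev
JOIN outages o
  ON o.id = orev.outage
WHERE o.observed_end IS NULL
  AND orev.observation_date <= '2011-11-21 00:00:00'
ORDER BY orev.outage, orev.observation_date DESC;
```

Note one difference in behavior: if two revisions of the same outage share the same latest observation_date, the original query returns both, while DISTINCT ON returns only one of them. This version also selects only the outage_revisions columns, so add the columns from outages if they are needed.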