Deducing the time taken from EXPLAIN estimates

Time: 2011-07-02 01:43:33

Tags: postgresql

Below is the EXPLAIN output for a query run with enable_seqscan = true.

 Hash Join  (cost=1028288.04..278841855100.04 rows=429471108 width=125)
   Hash Cond: ((u.destination)::text = (n.mid)::text)
   ->  Nested Loop  (cost=0.00..278587474234.17 rows=429471108 width=112)
     Join Filter: (((u.destination)::text <> (u2.mid)::text) AND ("position"((u2.path_name)::text, (suffix(u.path_name))::text) = 0) AND (((prefix((u.path_name)::text))::text = (prefix((u2.path_name)::text))::text) OR ((prefix((u.path_name)::text))::text = 'common'::text)))
     ->  Seq Scan on unresolved u2  (cost=0.00..2780546.32 rows=117608632 width=79)
     ->  Index Scan using unresolved__mid on unresolved u  (cost=0.00..1864.44 rows=492 width=53)
           Index Cond: ((u.mid)::text = (u2.destination)::text)
   ->  Hash  (cost=488335.24..488335.24 rows=27237024 width=33)
     ->  Seq Scan on name n  (cost=0.00..488335.24 rows=27237024 width=33)

(9 rows)

Below is the EXPLAIN output for the same query, but with enable_seqscan = false.

 Hash Join  (cost=102089128.45..279381508122.13 rows=429471108 width=125)
   Hash Cond: ((u.destination)::text = (n.mid)::text)
   ->  Nested Loop  (cost=0.00..279026066415.86 rows=429471108 width=112)
     Join Filter: (((u.destination)::text <> (u2.mid)::text) AND ("position"((u2.path_name)::text, (suffix(u.path_name))::text) = 0) AND (((prefix((u.path_name)::text))::text = (prefix((u2.path_name)::text))::text) OR ((prefix((u.path_name)::text))::text = 'common'::text)))
     ->  Index Scan using unresolved__destination on unresolved u2  (cost=0.00..441372728.01 rows=117608632 width=79)
     ->  Index Scan using unresolved__mid on unresolved u  (cost=0.00..1864.44 rows=492 width=53)
           Index Cond: ((u.mid)::text = (u2.destination)::text)
   ->  Hash  (cost=101549175.65..101549175.65 rows=27237024 width=33)
     ->  Index Scan using name_pkey on name n  (cost=0.00..101549175.65 rows=27237024 width=33)

(9 rows)

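I would like to know how long the query will take. It has been running for about 10 hours now. If the estimated time is derived from the "cost" on the first line, which in the latter case is '279381508122.13 ms', does that mean 8.8 years?! :-(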

1 answer:

Answer 0: (score: 1)

These numbers do not correspond to time. They are only relative figures. From the documentation (Using EXPLAIN):

  

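The costs are measured in arbitrary units determined by the planner's cost parameters (see Section 18.6.2). Traditional practice is to measure the costs in units of disk page fetches; that is, seq_page_cost is conventionally set to 1.0 and the other cost parameters are set relative to that. (The examples in this section are run with the default cost parameters.)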
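To make the difference concrete, here is a small Python sketch using the two top-level total costs from the question. It first reproduces the mistaken "cost is milliseconds" arithmetic, then shows the comparison the cost figures are actually meant for: one plan relative to another. The function names are illustrative only, and a 365-day year is assumed.

```python
# Top-level total costs copied from the two EXPLAIN outputs in the question.
COST_SEQSCAN_ON = 278841855100.04   # enable_seqscan = true
COST_SEQSCAN_OFF = 279381508122.13  # enable_seqscan = false

# Assumption: a year of 365 days, expressed in milliseconds.
MS_PER_YEAR = 1000 * 60 * 60 * 24 * 365


def naive_years(cost: float) -> float:
    """Misread a planner cost as milliseconds and convert it to years.

    This is the incorrect interpretation from the question, shown only
    to reproduce where the "8.8 years" figure comes from.
    """
    return cost / MS_PER_YEAR


def relative_cost(cost_a: float, cost_b: float) -> float:
    """Ratio of two plan costs, which is how the estimates are meant to be read."""
    return cost_a / cost_b


if __name__ == "__main__":
    print(f"naive reading: {naive_years(COST_SEQSCAN_OFF):.2f} years")
    ratio = relative_cost(COST_SEQSCAN_OFF, COST_SEQSCAN_ON)
    print(f"seqscan-off / seqscan-on cost ratio: {ratio:.4f}")
```

The ratio comes out only slightly above 1, so the planner considers the two plans nearly equivalent in cost. Neither figure says anything about wall-clock time.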

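In any case, the nested loop caused by that rather obscure join condition looks like what is killing your performance. It is hard to say without seeing the original query and the table/index structure, but you might benefit from creating a functional index on unresolved, assuming "prefix()" is an IMMUTABLE function: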

CREATE INDEX idx_path_name_prefix ON unresolved (prefix(path_name));