Question

Below is the EXPLAIN output for a query run with enable_seqscan = true.

 Hash Join  (cost=1028288.04..278841855100.04 rows=429471108 width=125)
   Hash Cond: ((u.destination)::text = (n.mid)::text)
   ->  Nested Loop  (cost=0.00..278587474234.17 rows=429471108 width=112)
     Join Filter: (((u.destination)::text <> (u2.mid)::text) AND ("position"((u2.path_name)::text, (suffix(u.path_name))::text) = 0) AND (((prefix((u.path_name)::text))::text = (prefix((u2.path_name)::text))::text) OR ((prefix((u.path_name)::text))::text = 'common'::text)))
     ->  Seq Scan on unresolved u2  (cost=0.00..2780546.32 rows=117608632 width=79)
     ->  Index Scan using unresolved__mid on unresolved u  (cost=0.00..1864.44 rows=492 width=53)
           Index Cond: ((u.mid)::text = (u2.destination)::text)
   ->  Hash  (cost=488335.24..488335.24 rows=27237024 width=33)
     ->  Seq Scan on name n  (cost=0.00..488335.24 rows=27237024 width=33)

(9 rows)

Below is the EXPLAIN output for the same query, but with enable_seqscan = false.

 Hash Join  (cost=102089128.45..279381508122.13 rows=429471108 width=125)
   Hash Cond: ((u.destination)::text = (n.mid)::text)
   ->  Nested Loop  (cost=0.00..279026066415.86 rows=429471108 width=112)
     Join Filter: (((u.destination)::text <> (u2.mid)::text) AND ("position"((u2.path_name)::text, (suffix(u.path_name))::text) = 0) AND (((prefix((u.path_name)::text))::text = (prefix((u2.path_name)::text))::text) OR ((prefix((u.path_name)::text))::text = 'common'::text)))
     ->  Index Scan using unresolved__destination on unresolved u2  (cost=0.00..441372728.01 rows=117608632 width=79)
     ->  Index Scan using unresolved__mid on unresolved u  (cost=0.00..1864.44 rows=492 width=53)
           Index Cond: ((u.mid)::text = (u2.destination)::text)
   ->  Hash  (cost=101549175.65..101549175.65 rows=27237024 width=33)
     ->  Index Scan using name_pkey on name n  (cost=0.00..101549175.65 rows=27237024 width=33)

(9 rows)

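I would like to know how long the query will take. It has already been running for about 10 hours. If I estimate the time from the "cost" on the first line, then in the second case '279381508122.13 ms' comes to 8.8 years?! :-(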

Answer 1

Those numbers are not times. They are only relative figures. From the documentation (Using EXPLAIN):

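The costs are measured in arbitrary units determined by the planner's cost parameters (see Section 18.6.2). Traditional practice is to measure the costs in units of disk page fetches; that is, seq_page_cost is conventionally set to 1.0 and the other cost parameters are set relative to that. (The examples in this section are run with the default cost parameters.)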
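To make the quoted passage concrete, here is a minimal sketch that should run on any PostgreSQL server. It lists the planner cost parameters that define the cost unit, then compares the two top-level estimates from the question as a plain ratio. The second query only divides two numbers copied from the plans above; it says nothing about wall-clock time.

```sql
-- The planner cost parameters that define the (arbitrary) cost unit.
SELECT name, setting
FROM pg_settings
WHERE name IN ('seq_page_cost', 'random_page_cost', 'cpu_tuple_cost',
               'cpu_index_tuple_cost', 'cpu_operator_cost');

-- Costs are only meaningful relative to each other: compare the
-- top-level estimate with enable_seqscan = false against the one
-- with enable_seqscan = true.
SELECT 279381508122.13 / 278841855100.04 AS cost_ratio;
```

The ratio comes out very close to 1, which suggests the planner considers the two plans almost equally expensive. Both estimates are dominated by the same nested loop.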

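In any case, the nested loop caused by the rather fuzzy join condition appears to be what is killing your performance. It is hard to say without seeing the original query and the table/index structure, but you might benefit from creating a functional index on unresolved, assuming prefix() is an IMMUTABLE function: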

CREATE INDEX idx_path_name_prefix ON unresolved (prefix(path_name));
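Before relying on that index, it may help to confirm how prefix() is actually declared, since PostgreSQL only accepts IMMUTABLE functions in index expressions. The sketch below assumes the function is simply named prefix (as in the plan) and that the table is called unresolved; the original query and schema are not shown, so treat both as assumptions. Note also that the plan writes the call as prefix((u.path_name)::text), so the index expression may need the same cast to match.

```sql
-- provolatile is 'i' (immutable), 's' (stable) or 'v' (volatile).
-- Only functions marked 'i' can be used in an index expression.
SELECT p.oid::regprocedure AS signature,
       p.provolatile
FROM pg_proc AS p
WHERE p.proname = 'prefix';

-- After creating the index, refresh statistics so the planner also
-- has statistics on the indexed expression.
ANALYZE unresolved;
```

Running EXPLAIN again afterwards would show whether the planner picks the new index; whether it does depends on the rest of the join condition.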
