PostgreSQL recursive CTE performance issue

Time: 2016-09-11 22:02:46

Tags: postgresql common-table-expression

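I'm trying to understand why the performance of two queries differs so much.

Suppose I have two tables. The first contains the A records for a set of domains: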

                                Table "public.dns_a"
 Column |          Type          | Modifiers | Storage  | Stats target | Description
--------+------------------------+-----------+----------+--------------+-------------
 name   | character varying(125) |           | extended |              |             
 a      | inet                   |           | main     |              |             
Indexes:
    "dns_a_a_idx" btree (a)
    "dns_a_name_idx" btree (name varchar_pattern_ops)  

The second table holds the CNAME records:

                              Table "public.dns_cname"
 Column |          Type          | Modifiers | Storage  | Stats target | Description
--------+------------------------+-----------+----------+--------------+-------------
 name   | character varying(256) |           | extended |              |             
 cname  | character varying(256) |           | extended |              |             
Indexes:
    "dns_cname_cname_idx" btree (cname varchar_pattern_ops)
    "dns_cname_name_idx" btree (name varchar_pattern_ops)  

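Now I'm trying to solve the simple problem of "get all domains pointing to the same IP address, including CNAMEs".

My first attempt, using a recursive CTE, works correctly: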

EXPLAIN ANALYZE WITH RECURSIVE names_traverse AS (
    (
        SELECT name::varchar(256), NULL::varchar(256) as cname, a FROM dns_a WHERE a = '118.145.5.20'
    )
    UNION ALL
        SELECT c.name, c.cname, NULL::inet as a FROM names_traverse nt, dns_cname c WHERE c.cname=nt.name
)
SELECT * FROM names_traverse; 

                                                                               QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 CTE Scan on names_traverse  (cost=3051757.20..4337044.86 rows=64264383 width=1064) (actual time=0.037..1697.444 rows=199 loops=1)
   CTE names_traverse
     ->  Recursive Union  (cost=0.57..3051757.20 rows=64264383 width=45) (actual time=0.036..1697.395 rows=199 loops=1)
           ->  Index Scan using dns_a_a_idx on dns_a  (cost=0.57..1988.89 rows=1953 width=24) (actual time=0.035..0.064 rows=14 loops=1)
                 Index Cond: (a = '118.145.5.20'::inet)
           ->  Merge Join  (cost=4377.00..176448.06 rows=6426243 width=45) (actual time=498.101..848.648 rows=92 loops=2)
                 Merge Cond: ((c.cname)::text = (nt.name)::text)
                 ->  Index Scan using dns_cname_cname_idx on dns_cname c  (cost=0.56..69958.06 rows=2268434 width=45) (actual time=4.732..688.456 rows=2219973 loops=2)
                 ->  Materialize  (cost=4376.44..4474.09 rows=19530 width=516) (actual time=0.039..0.084 rows=187 loops=2)
                       ->  Sort  (cost=4376.44..4425.27 rows=19530 width=516) (actual time=0.037..0.053 rows=100 loops=2)
                             Sort Key: nt.name USING ~<~
                             Sort Method: quicksort  Memory: 33kB
                             ->  WorkTable Scan on names_traverse nt  (cost=0.00..390.60 rows=19530 width=516) (actual time=0.001..0.007 rows=100 loops=2)
 Planning time: 0.130 ms
 Execution time: 1697.477 ms
(15 rows)         

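The example above involves two iterations of the recursion (loops=2), so if I write it as a plain outer join query instead, I get much better results: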

EXPLAIN ANALYZE 
SELECT * 
FROM dns_a a 
LEFT JOIN dns_cname c1 ON (c1.cname=a.name) 
LEFT JOIN dns_cname c2 ON (c2.cname=c1.name) 
WHERE a.a='118.145.5.20';

                                                                     QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=1.68..65674.19 rows=1953 width=114) (actual time=1.086..12.992 rows=189 loops=1)
   ->  Nested Loop Left Join  (cost=1.12..46889.57 rows=1953 width=69) (actual time=1.085..2.154 rows=189 loops=1)
         ->  Index Scan using dns_a_a_idx on dns_a a  (cost=0.57..1988.89 rows=1953 width=24) (actual time=0.022..0.055 rows=14 loops=1)
               Index Cond: (a = '118.145.5.20'::inet)
         ->  Index Scan using dns_cname_cname_idx on dns_cname c1  (cost=0.56..19.70 rows=329 width=45) (actual time=0.137..0.148 rows=13 loops=14)
               Index Cond: ((cname)::text = (a.name)::text)
   ->  Index Scan using dns_cname_cname_idx on dns_cname c2  (cost=0.56..6.33 rows=329 width=45) (actual time=0.057..0.057 rows=0 loops=189)
         Index Cond: ((cname)::text = (c1.name)::text)
 Planning time: 0.452 ms
 Execution time: 13.012 ms
(10 rows)

Time: 13.787 ms 

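So the performance difference is about 100x, and that is what worries me. I like the convenience of a recursive CTE and would rather use it than do dirty tricks on the application side, but I don't understand why the cost of Index Scan using dns_cname_cname_idx on dns_cname c (cost=0.56..69958.06 rows=2268434 width=45) (actual time=4.732..688.456 rows=2219973 loops=2) is so high.

Am I missing something important about CTEs, or is the problem somewhere else?

Thanks!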

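Update: A friend of mine spotted the number of rows affected, which I had missed: Index Scan using dns_cname_cname_idx on dns_cname c (cost=0.56..69958.06 rows=2268434 width=45) (actual time=4.732..688.456 rows=2219973 loops=2). It equals the total number of rows in the table. If I understand correctly, it performs a full index scan with no condition, and I don't see where the condition gets lost.

Result: after applying SET LOCAL enable_mergejoin TO false; the execution time is much better.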

EXPLAIN ANALYZE WITH RECURSIVE names_traverse AS (
    (
        SELECT name::varchar(256), NULL::varchar(256) as cname, a FROM dns_a WHERE a = '118.145.5.20'
    )
    UNION ALL
        SELECT c.name, c.cname, NULL::inet as a FROM names_traverse nt, dns_cname c WHERE c.cname=nt.name
)
SELECT * FROM names_traverse;
                                                                        QUERY PLAN                                                                        
----------------------------------------------------------------------------------------------------------------------------------------------------------
 CTE Scan on names_traverse  (cost=4746432.42..6527720.02 rows=89064380 width=1064) (actual time=0.718..45.656 rows=199 loops=1)
   CTE names_traverse
     ->  Recursive Union  (cost=0.57..4746432.42 rows=89064380 width=45) (actual time=0.717..45.597 rows=199 loops=1)
           ->  Index Scan using dns_a_a_idx on dns_a  (cost=0.57..74.82 rows=2700 width=24) (actual time=0.716..0.717 rows=14 loops=1)
                 Index Cond: (a = '118.145.5.20'::inet)
           ->  Nested Loop  (cost=0.56..296507.00 rows=8906168 width=45) (actual time=11.276..22.418 rows=92 loops=2)
                 ->  WorkTable Scan on names_traverse nt  (cost=0.00..540.00 rows=27000 width=516) (actual time=0.000..0.013 rows=100 loops=2)
                 ->  Index Scan using dns_cname_cname_idx on dns_cname c  (cost=0.56..7.66 rows=330 width=45) (actual time=0.125..0.225 rows=1 loops=199)
                       Index Cond: ((cname)::text = (nt.name)::text)
 Planning time: 0.253 ms
 Execution time: 45.697 ms
(11 rows)

1 answer:

Answer 0 (score: 3)

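As you noticed, the first query is slow because of the index scan.

The plan has to scan the whole index to get dns_cname sorted by cname, which the merge join needs. A merge join requires both inputs to be sorted by the join key. That can be done either with an index scan over the whole table (as in this case) or with a sequential scan followed by an explicit sort.

You will notice that the planner grossly overestimates all the row counts for the CTE evaluation, which is probably the root of the problem. For fewer rows, PostgreSQL might choose a nested loop join, which would not have to scan the whole of dns_cname.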
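To see where the planner's estimates come from, it can help to look at the statistics it has collected. The following is only a sketch: the table and column names come from the question, while the 'public' schema filter and the choice of columns to display are assumptions for illustration.

```sql
-- Per-column statistics the planner uses for selectivity estimates.
-- n_distinct > 0 is an absolute count of distinct values;
-- n_distinct < 0 is a negative fraction of the table's row count.
SELECT tablename, attname, null_frac, n_distinct
FROM pg_stats
WHERE schemaname = 'public'
  AND ((tablename = 'dns_a'     AND attname = 'a')
    OR (tablename = 'dns_cname' AND attname = 'cname'));

-- The planner's idea of the table sizes. reltuples is an estimate
-- that is refreshed by ANALYZE and VACUUM, not an exact count.
SELECT relname, reltuples::bigint AS estimated_rows
FROM pg_class
WHERE relname IN ('dns_a', 'dns_cname');
```

If n_distinct looks far too low for either column, the per-value row estimates (such as rows=1953 and rows=329 in the plans above) will be correspondingly too high.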

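This can probably be fixed. One thing I can see right away is that the estimate for the initial value '118.145.5.20' is too high by a factor of 139.5 (1953 estimated rows against 14 actual), which is pretty bad. You might be able to fix that by running ANALYZE on dns_a, possibly after increasing the statistics target for that column: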

ALTER TABLE dns_a ALTER a SET STATISTICS 1000;

See if that makes a difference.
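Put together, the sequence might look like the sketch below. The new statistics target only takes effect once the table is analyzed again. The final EXPLAIN is just one assumed way of checking the result, using the value from the question.

```sql
-- Collect more detailed statistics for column a of dns_a.
ALTER TABLE dns_a ALTER a SET STATISTICS 1000;

-- Recompute the statistics so the new target is actually used.
ANALYZE dns_a;

-- Compare the estimated row count with the actual one
-- (the plans in the question show 14 actual rows for this value).
EXPLAIN ANALYZE
SELECT name FROM dns_a WHERE a = '118.145.5.20';
```

If the estimated rows for the index scan on dns_a_a_idx move much closer to the actual count, rerun the recursive query and check whether the planner still picks the merge join.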

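If that doesn't do the trick, you can manually set enable_mergejoin and enable_hashjoin to off and see whether a plan with a nested loop join really is better. If you can change these parameters for just this one statement (perhaps with SET LOCAL) and get a better result that way, that is another option you have.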
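A minimal sketch of scoping the settings to one statement, assuming the query from the question: SET LOCAL lasts only until the end of the current transaction, so it has to be issued inside an explicit transaction block.

```sql
BEGIN;

-- These settings revert automatically at COMMIT or ROLLBACK.
SET LOCAL enable_mergejoin = off;
SET LOCAL enable_hashjoin = off;

WITH RECURSIVE names_traverse AS (
    (
        SELECT name::varchar(256), NULL::varchar(256) AS cname, a
        FROM dns_a
        WHERE a = '118.145.5.20'
    )
    UNION ALL
        SELECT c.name, c.cname, NULL::inet AS a
        FROM names_traverse nt, dns_cname c
        WHERE c.cname = nt.name
)
SELECT * FROM names_traverse;

COMMIT;
```

Other statements in the session keep the default planner settings, so the override does not leak into queries where a merge or hash join is the better choice.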