Question

I have this function. It works, and it returns the most recent b record.

create or replace function most_recent_b(the_a a) returns b as $$
    select distinct on (c.a_id) b.*
    from c 
    join b on b.c_id = c.id
    where c.a_id = the_a.id 
    order by c.a_id, b.date desc
$$ language sql stable;

With real data this runs in about 5000 ms, whereas the following runs in about 500 ms:

create or replace function most_recent_b(the_a a) returns b as $$
    select distinct on (c.a_id) b.*
    from c 
    join b on b.c_id = c.id
    where c.a_id = 1347 
    order by c.a_id, b.date desc
$$ language sql stable;

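The only difference is that I have hard-coded a.id as the value 1347 instead of using the parameter's value.

Running this query outside of a function also gives me about 500 ms.

I'm running PostgreSQL 9.6, so the advice I've seen elsewhere about the query planner failing inside functions shouldn't apply to me, right?

I'm sure the query itself is not the problem: this is my third iteration, and every technique I have tried for getting this result slows down in the same way once it is inside a function.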

As requested by @laurenz-albe

Results of EXPLAIN (ANALYZE, BUFFERS) with the constant:

Unique  (cost=60.88..60.89 rows=3 width=463) (actual time=520.117..520.122 rows=1 loops=1)
  Buffers: shared hit=14555
  ->  Sort  (cost=60.88..60.89 rows=3 width=463) (actual time=520.116..520.120 rows=9 loops=1)
        Sort Key: b.date DESC
        Sort Method: quicksort  Memory: 28kB
        Buffers: shared hit=14555
        ->  Hash Join  (cost=13.71..60.86 rows=3 width=463) (actual time=386.848..520.083 rows=9 loops=1)
              Hash Cond: (b.c_id = c.id)
              Buffers: shared hit=14555
              ->  Seq Scan on b (cost=0.00..46.38 rows=54 width=459) (actual time=25.362..519.140 rows=51 loops=1)
                    Filter: b_can_view(b.*)
                    Rows Removed by Filter: 112
                    Buffers: shared hit=14530
              ->  Hash  (cost=13.67..13.67 rows=3 width=8) (actual time=0.880..0.880 rows=10 loops=1)
                    Buckets: 1024  Batches: 1  Memory Usage: 9kB
                    Buffers: shared hit=25
                    ->  Subquery Scan on c  (cost=4.21..13.67 rows=3 width=8) (actual time=0.222..0.872 rows=10 loops=1)
                          Buffers: shared hit=25
                          ->  Bitmap Heap Scan on c c_1  (cost=4.21..13.64 rows=3 width=2276) (actual time=0.221..0.863 rows=10 loops=1)
                                Recheck Cond: (a_id = 1347)
                                Filter: c_can_view(c_1.*)
                                Heap Blocks: exact=4
                                Buffers: shared hit=25
                                ->  Bitmap Index Scan on c_a_id_c_number_idx  (cost=0.00..4.20 rows=8 width=0) (actual time=0.007..0.007 rows=10 loops=1)
                                      Index Cond: (a_id = 1347)
                                      Buffers: shared hit=1
Execution time: 520.256 ms

Here is the slow query, after passing the value through the parameter six times (it took exactly six, as you predicted :)):

Unique  (cost=57.07..57.07 rows=1 width=463) (actual time=5040.237..5040.243 rows=1 loops=1)
  Buffers: shared hit=145325
  ->  Sort  (cost=57.07..57.07 rows=1 width=463) (actual time=5040.237..5040.240 rows=9 loops=1)
        Sort Key: b.date DESC
        Sort Method: quicksort  Memory: 28kB
        Buffers: shared hit=145325
        ->  Nested Loop  (cost=0.14..57.06 rows=1 width=463) (actual time=912.354..5040.195 rows=9 loops=1)
              Join Filter: (c.id = b.c_id)
              Rows Removed by Join Filter: 501
              Buffers: shared hit=145325
              ->  Index Scan using c_a_id_idx on c (cost=0.14..9.45 rows=1 width=2276) (actual time=0.378..1.171 rows=10 loops=1)
                    Index Cond: (a_id = $1)
                    Filter: c_can_view(c.*)
                    Buffers: shared hit=25
              ->  Seq Scan on b (cost=0.00..46.38 rows=54 width=459) (actual time=24.842..503.854 rows=51 loops=10)
                    Filter: b_can_view(b.*)
                    Rows Removed by Filter: 112
                    Buffers: shared hit=145300
Execution time: 5040.375 ms

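It's worth noting that I have some strict row-level security in place, which I suspect is why both of these queries are slow. Still, one is 10 times slower than the other.

I've changed my original table names; hopefully my search and replace came out clean here.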

Answer 1

The expensive part of your query execution is the filter b_can_view(b.*), which must come from your row-level security definition.

The fast execution:

Seq Scan on b (cost=0.00..46.38 rows=54 width=459)
              (actual time=25.362..519.140 rows=51 loops=1)
  Filter: b_can_view(b.*)
  Rows Removed by Filter: 112
  Buffers: shared hit=14530

The slow execution:

Seq Scan on b (cost=0.00..46.38 rows=54 width=459)
              (actual time=24.842..503.854 rows=51 loops=10)
  Filter: b_can_view(b.*)
  Rows Removed by Filter: 112
  Buffers: shared hit=145300

The difference is that in the slow case the scan of b is executed 10 times (loops=10) and touches 10 times as many data blocks.

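When using the generic plan, PostgreSQL underestimates how many rows in c will satisfy the condition c.a_id = $1, because it doesn't know that the actual value is 1347, which occurs more frequently than average.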
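One way to observe the switch from custom plans to the generic plan outside of the function is a prepared statement. This is only a sketch: the statement name most_recent_b_q is made up for illustration, and the query body is the one from the question.

```sql
-- Hypothetical statement name; the body is the query from the question.
PREPARE most_recent_b_q(int) AS
    select distinct on (c.a_id) b.*
    from c
    join b on b.c_id = c.id
    where c.a_id = $1
    order by c.a_id, b.date desc;

-- The first five executions are planned with the actual parameter value.
EXECUTE most_recent_b_q(1347);
EXECUTE most_recent_b_q(1347);
EXECUTE most_recent_b_q(1347);
EXECUTE most_recent_b_q(1347);
EXECUTE most_recent_b_q(1347);

-- From the sixth execution on, PostgreSQL may switch to a generic plan.
-- If it does, the plan shows "a_id = $1" instead of "a_id = 1347".
EXPLAIN (ANALYZE, BUFFERS) EXECUTE most_recent_b_q(1347);

DEALLOCATE most_recent_b_q;
```

Whether the generic plan is actually chosen depends on how its estimated cost compares with the custom plans, so the sixth EXPLAIN is the one to look at.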

Since PostgreSQL thinks that at most one row will come from c, it chooses a nested loop join with a sequential scan of b on the inner side.

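Now two problems combine:

Calling the function b_can_view takes over 3 milliseconds per row (which PostgreSQL doesn't know), and that accounts for the half second that a sequential scan of 163 rows takes.
There are actually 10 rows found in c rather than the predicted 1, so table b is scanned 10 times, and you end up with a query duration of 5 seconds.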
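The arithmetic behind those two points can be checked against the numbers in the two plans. The query below is just a calculator; the column aliases are made up for illustration, and the inputs are the row counts and timings reported for the Seq Scan on b.

```sql
SELECT 51 + 112                       AS rows_scanned,        -- rows kept + rows removed by the filter
       round(519.140 / (51 + 112), 2) AS ms_per_row,          -- one pass over b in the fast plan
       round(503.854 * 10)            AS slow_scan_total_ms;  -- per-loop time times loops=10 in the slow plan
```

This comes out at roughly 3 ms per row and roughly 5 seconds for the ten scans, which matches the execution times in the question.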

So what can you do?

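Tell PostgreSQL how expensive the function is. Use ALTER FUNCTION to set the COST of that function to 1000 or 10000 to reflect reality. That alone will not be enough to get a faster plan, since PostgreSQL thinks that it has to execute only a single sequential scan, but it is a good thing to give the optimizer correct data.
Create an index on b(c_id). That will enable PostgreSQL to avoid a sequential scan of b, which it will try to do once it is aware of how expensive the function is.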
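The two suggestions above could look roughly like this. It is a sketch under assumptions: the argument type is guessed from the b_can_view(b.*) call in the plan, the index name is made up, and the cost value is one of the two suggested numbers rather than a measured one.

```sql
-- Assumption: b_can_view takes one argument of row type b,
-- as the filter b_can_view(b.*) in the plan suggests.
ALTER FUNCTION b_can_view(b) COST 1000;

-- Hypothetical index name; the indexed column is the join column b.c_id.
CREATE INDEX b_c_id_idx ON b (c_id);

-- Refresh the planner statistics for b.
ANALYZE b;
```

After that, running EXPLAIN (ANALYZE, BUFFERS) again would show whether the planner now avoids the repeated sequential scan of b.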

Also, try to make the function b_can_view cheaper. That will make your experience much better.

Postgres 9.6 function performing poorly compared to straight SQL
