Row-level security query plan is slower than the identical non-RLS query

Asked: 2016-09-22 18:38:46

Tags: performance postgresql sql-execution-plan row-level-security

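I'm having trouble getting the query planner to produce good plans for tables that have row-level security (RLS) enabled. It seems that all it takes to force a bad plan is a join from an RLS-enabled table to a table without RLS, even when both tables have appropriate indexes that the planner should be able to use.

Is there any way to help the planner out here? Or are certain statistics unavailable when RLS is involved?

I tried enabling RLS on the table that does not need it (adding a wide-open policy with USING (TRUE)), and the effect was the same as having no policy on that table.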

DROP SCHEMA IF EXISTS foo CASCADE;
CREATE SCHEMA foo;

CREATE TABLE foo.bar AS
SELECT generate_series(1,10000000) AS id, md5(random()::text) AS descr, random() * 5 + 1 AS licflag;

CREATE TABLE foo.baz AS
SELECT generate_series(1,10000000) AS id, md5(random()::text) AS descr, random() * 5 + 1 AS licflag;

CREATE UNIQUE INDEX ON foo.bar (id);
CREATE INDEX ON foo.bar (licflag);

CREATE UNIQUE INDEX ON foo.baz (id);
CREATE INDEX ON foo.baz (licflag);

ANALYZE foo.bar;
ANALYZE foo.baz;

ALTER TABLE foo.bar ENABLE ROW LEVEL SECURITY;
--ALTER TABLE foo.baz ENABLE ROW LEVEL SECURITY;

DROP ROLE IF EXISTS restricted;
CREATE ROLE restricted NOINHERIT;

REVOKE ALL PRIVILEGES ON ALL TABLES IN SCHEMA foo FROM restricted;
GRANT restricted to current_user;
GRANT USAGE ON SCHEMA foo TO restricted;
GRANT SELECT ON ALL TABLES IN SCHEMA foo TO restricted;

CREATE POLICY restrict_foo ON foo.bar 
FOR SELECT TO restricted
USING (licflag < 3);

/*
CREATE POLICY restrict_foo ON foo.baz 
FOR SELECT TO restricted
USING (TRUE);
*/

EXPLAIN ANALYZE
SELECT * 
FROM foo.bar f1
JOIN foo.baz f2 ON f1.id = f2.id 
WHERE f2.id BETWEEN 500 AND 12000
AND f1.licflag < 3;

SET ROLE restricted;

EXPLAIN ANALYZE
SELECT * 
FROM foo.bar f1
JOIN foo.baz f2 ON f1.id = f2.id 
WHERE f2.id BETWEEN 500 AND 12000;

Results

                                                                QUERY PLAN                                                                
------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.87..87677.34 rows=4668 width=90) (actual time=0.091..45.337 rows=4622 loops=1)
   ->  Index Scan using baz_id_idx on baz f2  (cost=0.43..471.90 rows=11573 width=45) (actual time=0.042..4.496 rows=11501 loops=1)
         Index Cond: ((id >= 500) AND (id <= 12000))
   ->  Index Scan using bar_id_idx on bar f1  (cost=0.43..7.53 rows=1 width=45) (actual time=0.003..0.003 rows=0 loops=11501)
         Index Cond: (id = f2.id)
         Filter: (licflag < '3'::double precision)
         Rows Removed by Filter: 1
 Planning time: 1.300 ms
 Execution time: 45.826 ms
(9 rows)

SET
                                                                QUERY PLAN                                                                
------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=569.62..273628.35 rows=4227 width=90) (actual time=8.317..2074.996 rows=4558 loops=1)
   Hash Cond: (f1.id = f2.id)
   ->  Seq Scan on bar f1  (cost=0.00..218457.95 rows=3967891 width=45) (actual time=0.016..1616.577 rows=3998388 loops=1)
         Filter: (licflag < '3'::double precision)
         Rows Removed by Filter: 6001612
   ->  Hash  (cost=436.47..436.47 rows=10652 width=45) (actual time=8.033..8.033 rows=11501 loops=1)
         Buckets: 16384  Batches: 1  Memory Usage: 1027kB
         ->  Index Scan using baz_id_idx on baz f2  (cost=0.43..436.47 rows=10652 width=45) (actual time=0.026..4.871 rows=11501 loops=1)
               Index Cond: ((id >= 500) AND (id <= 12000))
 Planning time: 0.305 ms
 Execution time: 2075.371 ms
(11 rows)

psql (9.5.3, server 9.5.4)

Update 1:

I ran the query with the RLS predicate repeated in the WHERE clause, like this:

EXPLAIN ANALYZE
SELECT *
FROM foo.bar f1
JOIN foo.baz f2 ON f1.id = f2.id
WHERE f1.id BETWEEN 500 AND 12000
AND f1.licflag < 3;

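It produced the better query plan. But when I then removed the extra predicate, the planner KEPT the better plan. That makes me think something is off with the statistics. Does anyone know how to manually trigger a statistics update without resetting the statistics for the whole database? Off to read the Postgres docs myself in the meantime...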
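On the question of refreshing statistics for individual tables, here is a minimal sketch, assuming the session can switch back to the role that owns the tables. ANALYZE with a table name recomputes planner statistics for that table only and leaves the rest of the database alone. Nothing in this post establishes that stale statistics are the cause of the bad plan, so treat this as a check rather than a fix.

```sql
-- Return to the owning role; the restricted role only has SELECT
-- and cannot analyze these tables.
RESET ROLE;

-- Recompute planner statistics for just these two tables.
ANALYZE VERBOSE foo.bar;
ANALYZE VERBOSE foo.baz;

-- Confirm when each table was last analyzed (manually or by autovacuum).
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE schemaname = 'foo';
```

After this, SET ROLE restricted again and re-run the EXPLAIN ANALYZE from the script to see whether the plan changes.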

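Update 2:

Setting global and table-level statistics limits had no effect. I was able to get the correct query plan by rewriting a similar query with a subselect instead of a join, so that technique can serve as a workaround.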
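The exact statements behind this update are not shown, so the following is only a sketch of what they might look like. It assumes "statistics limits" means the planner's statistics targets, and the value 1000, the choice of the licflag column, and the IN (...) form of the subselect are all illustrative assumptions, not the queries actually run.

```sql
-- Assumed form of the statistics settings (run as the table owner).
SET default_statistics_target = 1000;                          -- session-wide target
ALTER TABLE foo.bar ALTER COLUMN licflag SET STATISTICS 1000;  -- per-column target
ANALYZE foo.bar;  -- new targets only take effect at the next ANALYZE

-- Hypothetical subselect rewrite of the join, run under the RLS role.
SET ROLE restricted;

EXPLAIN ANALYZE
SELECT *
FROM foo.bar f1
WHERE f1.id IN (
    SELECT f2.id
    FROM foo.baz f2
    WHERE f2.id BETWEEN 500 AND 12000
);
```

Note that this form returns only the columns of foo.bar, so it is a similar query rather than an exact replacement for the join.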

0 answers:

No answers