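I'm trying to store ACL-like data in a table and check whether a given path matches any of the stored patterns.
I tested this on both MySQL and PostgreSQL.
Here are my table and its (BTREE) index: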
create table acl (id serial, pattern text, block bool);
create index acl_pattern on acl(pattern);
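I first tried storing wildcards like this. It works, but I couldn't find a way to make the query use the index, and I suspect that isn't possible: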
insert into acl values (default, '/public/%', false);
insert into acl values (default, '/admin/%', true);
select * from acl where '/public/hello' like pattern;
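Since most (if not all) patterns will simply be prefixes, I tried to avoid wildcards by doing the following, but I can't get the index to be used here either: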
insert into acl values (default, '/public/', false);
insert into acl values (default, '/admin/', true);
// PostgreSQL
test=# explain analyze select block from acl where pattern = substring('/public/blabla', 0, length(pattern)+1);
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Seq Scan on acl (cost=10000000000.00..10000000001.04 rows=1 width=1) (actual time=0.058..0.059 rows=1 loops=1)
Filter: (pattern = "substring"('/public/blabla'::text, 0, (length(pattern) + 1)))
Rows Removed by Filter: 1
Planning Time: 0.074 ms
Execution Time: 0.085 ms
(5 rows)
test=# explain analyze select block from acl where pattern = 'test';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Index Scan using acl_pattern on acl (cost=0.13..8.14 rows=1 width=1) (actual time=0.039..0.039 rows=0 loops=1)
Index Cond: (pattern = 'test'::text)
Planning Time: 0.147 ms
Execution Time: 1.063 ms
(4 rows)
// MySQL
mysql> explain select block from acl where pattern = left('/public/blabla', length(pattern));
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | acl | NULL | ALL | NULL | NULL | NULL | NULL | 2 | 50.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
mysql> explain select block from acl where pattern = "hello";
+----+-------------+-------+------------+------+---------------+-------------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+-------------+---------+-------+------+----------+-------+
| 1 | SIMPLE | acl | NULL | ref | acl_pattern | acl_pattern | 1019 | const | 1 | 100.00 | NULL |
+----+-------------+-------+------------+------+---------------+-------------+---------+-------+------+----------+-------+
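When I replace the right-hand side of the comparison with a static value, the index is used correctly. It looks like calling a function, or referencing the pattern column on the right-hand side, prevents the index from being used?
I also tried CockroachDB for comparison (the queries are exactly the same as for PostgreSQL) and got exactly the same behavior: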
root@:26257/defaultdb> explain select block from acl where pattern = substring('/public/blabla', 0, length(pattern)+1);
tree | field | description
+-----------+--------+---------------------------------------------------------------+
render | |
└── scan | |
| table | acl@primary
| spans | ALL
| filter | pattern = substring('/public/blabla', 0, length(pattern) + 1)
root@:26257/defaultdb> explain select block from acl where pattern = 'hello';
tree | field | description
+-----------------+-------+-----------------------------+
render | |
└── index-join | |
├── scan | |
│ | table | acl@acl_pattern
│ | spans | /"hello"-/"hello"/PrefixEnd
└── scan | |
| table | acl@primary
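Answer 0 (score: 1)
It seems the index can't be used because the expression on the right-hand side depends on pattern
(so it has to be evaluated for every row read from the table).
Assuming you can determine a minimum length for the patterns (say 6 characters), you could try something like this: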
create index acl_pattern on acl(left(pattern, 6));
select *
from acl
where left(pattern, 6) = left('/public/something', 6) and '/public/something' like pattern
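To make the idea above easier to try out, here is a minimal PostgreSQL sketch. It assumes the wildcard rows from the question ('/public/%' and '/admin/%') and that every pattern has at least 6 literal characters before its first wildcard. The index name acl_pattern_prefix is made up here so that it does not clash with the existing acl_pattern index.

```sql
-- Assumption: every stored pattern starts with at least 6 literal characters.
-- acl_pattern_prefix is a hypothetical name chosen for this sketch.
create index acl_pattern_prefix on acl (left(pattern, 6));

-- The first condition compares the indexed expression with a value that is
-- constant for the whole query, so the planner is able to consider the index.
-- The LIKE condition then filters the remaining candidate rows.
explain
select block
from acl
where left(pattern, 6) = left('/public/hello', 6)
  and '/public/hello' like pattern;
```

The expression in the WHERE clause has to be written the same way as in the index definition for the planner to match them. On a table with only a couple of rows the plan may still show a sequential scan, so it is worth testing with a realistic number of rows.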
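Answer 1 (score: 1)
To use LIKE, your index is missing the text_pattern_ops operator class. Postgres is a bit particular about character data: how it handles btree comparisons depends on your locale settings, so it's worth reading up on. TL;DR: for LIKE to use your index, it should look like this: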
create index acl_pattern on acl(pattern text_pattern_ops);
https://www.postgresql.org/docs/11/indexes-opclass.html
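The other issue is that Postgres has a cost-based query planner, so if your table has only 2 rows it won't bother checking the index first: the index would most likely just point it to the single page that holds both rows anyway.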
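As a rough illustration of what text_pattern_ops is meant for, the sketch below (PostgreSQL only) contrasts two query shapes. The index name acl_pattern_ops is hypothetical, and no plan output is shown because it depends on table size and settings.

```sql
-- acl_pattern_ops is a hypothetical name so the original index can stay in place.
create index acl_pattern_ops on acl (pattern text_pattern_ops);

-- Shape the operator class targets: the indexed column on the left of LIKE,
-- compared with a constant pattern that is anchored at the start.
explain select * from acl where pattern like '/public/%';

-- Shape used in the question: a constant on the left and the column on the
-- right, so the pattern itself comes from each row.
explain select * from acl where '/public/hello' like pattern;
```

As far as I know, a btree index is only consulted when the indexed column is compared with a value that is known before the scan starts, which is the case in the first query but not in the second. Comparing the two plans on a table with enough rows should show whether the operator class helps for the query you actually need.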
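Answer 2 (score: 0)
(From a MySQL perspective. I don't speak Postgres.)
SELECT ...
FROM ...
WHERE pattern <= '/public/blabla'
ORDER BY pattern DESC
LIMIT 1
That will get you the first matching row in O(1) time. Otherwise it will give you something that does not match. Now, let's check it:
SELECT ...
FROM ( SELECT --- as above ) AS x
WHERE pattern = LEFT('/public/blabla', CHAR_LENGTH(pattern))
That will deliver either 1 row or the empty set.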
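Written out in full for the acl table, the two steps of this answer could look like the sketch below. It uses MySQL syntax and assumes the prefix rows from the question ('/public/' and '/admin/'); the selected columns are my choice for illustration.

```sql
-- Inner query: use the index on pattern to find the greatest stored pattern
-- that sorts at or before the path being checked.
-- Outer query: keep that row only if it really is a prefix of the path.
SELECT x.block
FROM (
    SELECT pattern, block
    FROM acl
    WHERE pattern <= '/public/blabla'
    ORDER BY pattern DESC
    LIMIT 1
) AS x
WHERE x.pattern = LEFT('/public/blabla', CHAR_LENGTH(x.pattern));
```

With the two sample rows, the inner query picks '/public/' and the outer check keeps it. One thing to watch for: if the table holds nested prefixes such as '/public/' and '/public/a/', a path like '/public/b' makes the inner query pick '/public/a/', which fails the check, so the result is empty even though '/public/' would have matched.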