我正在使用PostgreSQL的Ltree模块来存储分层数据。我正在寻找按特定列排序的完整层次结构。
考虑下表:
votes | path | ...
-------+-------+-----
1 | 1 | ...
2 | 1.1 | ...
4 | 1.2 | ...
1 | 1.2.1 | ...
3 | 2 | ...
1 | 2.1 | ...
2 | 2.1.1 | ...
4 | 2.1.2 | ...
... | ... | ...
在我当前的实现中,我将使用SELECT * FROM comments ORDER BY path
查询数据库,这将返回整个树:
Node 1
-- Node 1.1
-- Node 1.2
---- Node 1.2.1
Node 2
-- Node 2.1
---- Node 2.1.1
---- Node 2.1.2
但是,我希望按votes
排序(不是按id
排序,这是按path
排序的数量)。每个深度级别都需要独立排序,正确的树形结构保持不变。可以返回以下内容的东西:
Node 2
-- Node 2.1
---- Node 2.1.2
---- Node 2.1.1
Node 1
-- Node 1.2
---- Node 1.2.1
-- Node 1.1
Postgres'WITH RECURSIVE
可能合适,但我不确定。有什么想法吗?
答案 0 :(得分:12)
您使用WITH RECURSIVE
开启了正确的轨道。
WITH RECURSIVE t AS (
SELECT t.votes
, t.path
, 1::int AS lvl
, to_char(t2.votes, 'FM0000000') AS sort
FROM tbl t
JOIN tbl t2 ON t2.path = subltree(t.path, 0, 1)
UNION ALL
SELECT t.votes
, t.path
, t.lvl + 1
, t.sort || to_char(t2.votes, 'FM0000000')
FROM t
JOIN tbl t2 ON t2.path = subltree(t.path, 0, t.lvl + 1)
WHERE nlevel(t.path) > t.lvl
)
SELECT votes, path, max(sort) AS sort
FROM t
GROUP BY 1, 2
ORDER BY max(sort), path;
关键部分是用votes
的值替换路径的每个级别。因此,我们最后组装了一列ORDER BY
。这是必要的,因为路径的深度未知,我们无法通过静态SQL中未知数量的表达式进行排序。
为了获得稳定的排序,我使用to_char()
将votes
转换为带有前导零的字符串。我在演示中使用七位数,适用于低于10.000.000 的投票值。根据您的最高投票数进行调整。
在最后SELECT
我排除所有中间状态以消除重复。只剩下max(sort)
的最后一步。
这适用于带有递归CTE的标准SQL,但对大树不是很有效。 plpgsql函数以递归方式更新临时表中的排序路径而不创建临时欺骗可能会表现得更好。
仅适用于安装的ltree module。函数subltree(...)和nlevel(。)以及ltree日期类型不是标准PostgreSQL的一部分。
我的测试设置,方便您查看:
CREATE TEMP TABLE tbl(votes int, path ltree);
INSERT INTO tbl VALUES
(1, '1')
, (2, '1.1')
, (4, '1.2')
, (1, '1.2.1')
, (3, '2')
, (1, '2.1')
, (2, '2.1.1')
, (4, '2.1.2')
, (1, '2.1.3')
, (2, '3')
, (17, '3.3')
, (99, '3.2')
, (10, '3.1.1')
, (2345, '3.1.2')
, (1, '3.1.3');
大树应该更快。
CREATE OR REPLACE FUNCTION f_sorted_ltree()
RETURNS TABLE(votes int, path ltree) AS
$func$
DECLARE
lvl integer := 0;
BEGIN
CREATE TEMP TABLE t ON COMMIT DROP AS
SELECT tbl.votes
, tbl.path
, ''::text AS sort
, nlevel(tbl.path) AS depth
FROM tbl;
-- CREATE INDEX t_path_idx ON t (path); -- beneficial for huge trees
-- CREATE INDEX t_path_idx ON t (depth);
LOOP
lvl := lvl + 1;
UPDATE t SET sort = t.sort || to_char(v.votes, 'FM0000000')
FROM (
SELECT t2.votes, t2.path
FROM t t2
WHERE t2.depth = lvl
) v
WHERE v.path = subltree(t.path, 0 ,lvl);
EXIT WHEN NOT FOUND;
END LOOP;
-- Return sorted rows
RETURN QUERY
SELECT t.votes, t.path
FROM t
ORDER BY t.sort;
END
$func$ LANGUAGE plpgsql VOLATILE;
呼叫:
SELECT * FROM f_sorted_ltree();
请阅读manual about setting temp_buffers
。
我很感兴趣,它可以更快地处理您的现实生活数据。
答案 1 :(得分:0)
create table comments (
id serial,
parent_id int,
msg text,
primary key (id)
);
insert into comments (id, parent_id, msg) values (1, null, 'msg 1');
insert into comments (id, parent_id, msg) values (2, null, 'msg 2');
insert into comments (id, parent_id, msg) values (3, 1, 'msg 1 / ans 1');
insert into comments (id, parent_id, msg) values (4, null, 'msg 3');
insert into comments (id, parent_id, msg) values (5, 2, 'msg 2 / ans 1');
insert into comments (id, parent_id, msg) values (6, 2, 'msg 2 / ans 2');
insert into comments (id, parent_id, msg) values (7, 2, 'msg 2 / ans 3');
递减
WITH RECURSIVE q AS
(
SELECT id, msg, 1 as level, ARRAY[id] as path
FROM comments c
WHERE parent_id is null
UNION ALL
SELECT sub.id, sub.msg, level + 1, path || sub.id
FROM q
JOIN comments sub
ON sub.parent_id = q.id
)
SELECT id, msg, level
FROM q
order by path || array_fill(100500, ARRAY[8 - level]) desc;
结果
4,"msg 3",1
2,"msg 2",1
7,"msg 2 / ans 3",2
6,"msg 2 / ans 2",2
5,"msg 2 / ans 1",2
1,"msg 1",1
3,"msg 1 / ans 1",2
ASC
WITH RECURSIVE q AS
(
SELECT id, msg, 1 as level, ARRAY[id] as path
FROM comments c
WHERE parent_id is null
UNION ALL
SELECT sub.id, sub.msg, level + 1, path || sub.id
FROM q
JOIN comments sub
ON sub.parent_id = q.id
)
SELECT id, msg, level
FROM q
--order by path || array_fill(100500, ARRAY[8 - level]) desc;
order by path;
结果
1,"msg 1",1
3,"msg 1 / ans 1",2
2,"msg 2",1
5,"msg 2 / ans 1",2
6,"msg 2 / ans 2",2
7,"msg 2 / ans 3",2
4,"msg 3",1