PostgreSQL - sorting jumbled data (recursively?)

Date: 2018-11-22 04:51:35

Tags: postgresql

I have some unordered data in a PostgreSQL database that looks like this:

ID  PATH    START   END
7   A123    B       C
4   B456    D       E
9   A123    A       B
2   B456    A       B
6   B456    B       C
21  A123    C       D
3   B456    C       D
5   B456    E       F

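The START and END values cannot be sorted alphabetically; the letters here are only used to illustrate the problem.

This is what I am trying to achieve: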

id  path    sequence    start   end
9   A123    1           A       B
7   A123    2           B       C
21  A123    3           C       D
2   B456    1           A       B
6   B456    2           B       C
3   B456    3           C       D
4   B456    4           D       E
5   B456    5           E       F

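The logic I have in mind is to determine the starting value (shown as A) for each path (A123 / B456), and then work out the sequence from there (A-B, B-C, C-D, and so on). This has to be repeated for every path.

I wrote a recursive query that walks a single given path name (see WHERE path = 'B456'):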

WITH RECURSIVE ordered(id, path, sequence, "start", "end") AS (
    WITH path AS (
        SELECT id, "path", "start", "end"
        FROM unordered
        WHERE path = 'B456'
    ),
    startofpath AS (
        SELECT p1.id
        FROM unordered p1
        LEFT JOIN unordered p2 ON p1.start = p2.end
        WHERE p2.start IS NULL
    )

    --find start of path (A)
    SELECT path.id, path.path, 1, path.start, path.end
    FROM path, startofpath
    WHERE path.id = startofpath.id
    UNION ALL
    --add on next path (B -> C)
    SELECT path.id, path.path, ordered.sequence + 1, path.start, path.end
    FROM path
    INNER JOIN ordered
    ON path.start = ordered."end"
)
SELECT * FROM ordered;

Sample data:

CREATE table unordered (
id   INT PRIMARY KEY,
path TEXT NOT NULL,
"start" TEXT NOT NULL,
"end" TEXT NOT NULL);

INSERT INTO unordered (id, path, "start", "end") VALUES (7,'A123','B','C');
INSERT INTO unordered (id, path, "start", "end") VALUES (4,'B456','D','E');
INSERT INTO unordered (id, path, "start", "end") VALUES (9,'A123','A','B');
INSERT INTO unordered (id, path, "start", "end") VALUES (2,'B456','A','B');
INSERT INTO unordered (id, path, "start", "end") VALUES (6,'B456','B','C');
INSERT INTO unordered (id, path, "start", "end") VALUES (21,'A123','C','D');
INSERT INTO unordered (id, path, "start", "end") VALUES (3,'B456','C','D');
INSERT INTO unordered (id, path, "start", "end") VALUES (5,'B456','E','F');

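The problem I now need to solve is how to iterate over all paths (A123, then B456, and so on).

Can anyone help with the next step? (Or, if I have the wrong idea entirely, suggest how to rework the query.)

Many thanks!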

1 answer:

Answer 0 (score: 0)

Is this what you are looking for?

WITH RECURSIVE
get_path(id, path, sequence, starting, ending) AS (
    -- a starting point has no entry in the ending column of its own path
    SELECT u.id, u.path, 1, u.starting, u.ending
    FROM unordered AS u
    WHERE NOT EXISTS (
        SELECT 1
        FROM unordered AS p
        WHERE p.path = u.path AND p.ending = u.starting
    )

    UNION

    -- append the segment on the same path that starts where the previous one ended
    SELECT u.id, u.path, g.sequence + 1, u.starting, u.ending
    FROM get_path AS g
    JOIN unordered AS u ON u.path = g.path AND u.starting = g.ending
)
TABLE get_path ORDER BY path, sequence;

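Note that I renamed "start" to starting and "end" to ending, so the query only runs as written once the table's columns are renamed to match.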
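
If you would rather keep the table exactly as defined in the question, the same idea might look like the sketch below, which uses the original quoted column names. It assumes the unordered table and sample rows from the question; the CTE name ordered is borrowed from the question's query, and each step is restricted to rows of the same path so that segments from different paths are never chained together.

```sql
-- Sketch only: assumes the unordered table from the question,
-- with its original quoted column names "start" and "end".
WITH RECURSIVE ordered(id, path, sequence, "start", "end") AS (
    -- Anchor: segments whose "start" is not the "end"
    -- of any other segment on the same path.
    SELECT u.id, u.path, 1, u."start", u."end"
    FROM unordered AS u
    WHERE NOT EXISTS (
        SELECT 1
        FROM unordered AS p
        WHERE p.path = u.path
          AND p."end" = u."start"
    )

    UNION ALL

    -- Recursive step: append the segment on the same path
    -- that begins where the previous segment ended.
    SELECT u.id, u.path, o.sequence + 1, u."start", u."end"
    FROM ordered AS o
    JOIN unordered AS u
      ON u.path = o.path
     AND u."start" = o."end"
)
SELECT id, path, sequence, "start", "end"
FROM ordered
ORDER BY path, sequence;
```

Run against the sample rows, this should return the eight rows in the order shown in the question's desired output. Note that a path whose segments form a closed loop has no starting segment under this definition, so it would not appear in the result at all.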