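How do I find the distinct root nodes that have the newest nodes in their trees (stored in a closure table)?

Time: 2011-02-10 14:52:04

Tags: sql tree hierarchical-data

I am trying to store a message tree as a closure table in MySQL. I learned most of this approach from Bill Karwin's presentation Models for hierarchical data. The problem: I want to find the 3 distinct root nodes (= nodes with no ancestors) whose trees contain the newest nodes. NB! Even if one root node has the 10 newest nodes in its subtree, it counts only once.

Every node has its own modification time. For simplicity, we can say that the node IDs also represent when the nodes were last modified (but we can't use the id as the time in the query): node 1 is the oldest and node 17 is the newest.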

1
 2
  8
   15
   16
  17
 7
3
 4
  5
 6
9
 12
10
 14
11
 13

The closure table has 3 columns (ancestor, descendant, depth), so the tree is represented like this:

| ancestor | descendant | depth |
+----------+------------+-------+
|        1 |          1 |     0 | 
|        1 |          2 |     1 | 
|        1 |          7 |     1 | 
|        1 |          8 |     2 | 
|        1 |         15 |     3 | 
|        1 |         16 |     3 | 
|        1 |         17 |     2 | 
|        2 |          2 |     0 | 
|        2 |          8 |     1 | 
|        2 |         15 |     2 | 
|        2 |         16 |     2 | 
|        2 |         17 |     1 | 
|        3 |          3 |     0 | 
|        3 |          4 |     1 | 
|        3 |          5 |     2 | 
|        3 |          6 |     1 | 
|        4 |          4 |     0 | 
|        4 |          5 |     1 | 
|        5 |          5 |     0 | 
|        6 |          6 |     0 | 
|        7 |          7 |     0 | 
|        8 |          8 |     0 | 
|        8 |         15 |     1 | 
|        8 |         16 |     1 | 
|        9 |          9 |     0 | 
|        9 |         12 |     1 | 
|       10 |         10 |     0 | 
|       10 |         14 |     1 | 
|       11 |         11 |     0 | 
|       11 |         13 |     1 | 
|       12 |         12 |     0 | 
|       13 |         13 |     0 | 
|       14 |         14 |     0 | 
|       15 |         15 |     0 | 
|       16 |         16 |     0 | 
|       17 |         17 |     0 | 
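
For readers who want to reproduce the table, here is a minimal sketch of how these rows follow from the tree. The `PARENT` map and the `closure_rows` helper are names made up for this illustration, not part of the original post.

```python
# Example tree from the question as node -> parent (None marks a root).
# PARENT and closure_rows are hypothetical names for this sketch.
PARENT = {1: None, 2: 1, 7: 1, 8: 2, 17: 2, 15: 8, 16: 8,
          3: None, 4: 3, 6: 3, 5: 4,
          9: None, 12: 9, 10: None, 14: 10, 11: None, 13: 11}


def closure_rows(parent):
    """Return sorted (ancestor, descendant, depth) rows for a parent map."""
    rows = []
    for node in parent:
        ancestor, depth = node, 0
        # Walk from the node up to its root, emitting one row per ancestor.
        # The first row is the self-reference with depth 0.
        while ancestor is not None:
            rows.append((ancestor, node, depth))
            ancestor, depth = parent[ancestor], depth + 1
    return sorted(rows)


rows = closure_rows(PARENT)
print(len(rows))   # 36, the same number of rows as the table above
print(rows[:3])    # [(1, 1, 0), (1, 2, 1), (1, 7, 1)]
```

Each node contributes one row per ancestor plus its self-reference, which is why deep nodes such as 15 and 16 account for four rows each.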

I can get the newest subtrees like this:

SELECT c.ancestor, MAX(time) AS t 
FROM closure c 
    JOIN nodes n ON (c.descendant = n.node AND c.ancestor <> n.node) 
GROUP BY c.ancestor ORDER BY t desc;

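But how can I get the 3 distinct root nodes with the newest posts (in this case 1, 10 and 11)? Is this possible (and reasonable) in a single query?


Edit: I put the sample tables on pastebin.

6 answers:

Answer 0 (score: 2)

I came up with a solution of sorts. "Of sorts" because I had to add an extra column to the nodes table: root. It indicates whether the node is a root node. With this extra bit I can compose a query like this: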

SELECT c.ancestor, MAX(n.time) AS t FROM closure c
    JOIN nodes n ON (c.descendant = n.node AND c.ancestor <> n.node)
    JOIN nodes n2 ON (c.ancestor = n2.node AND n2.root = 1) 
    GROUP BY c.ancestor ORDER BY t desc LIMIT 3;
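
To try the query without a MySQL server, here is a minimal sketch that loads the question's example tree into an in-memory SQLite database and runs the same statement. Using SQLite instead of MySQL, integer stand-in timestamps equal to the node id, and the `PARENT` map are all assumptions of this sketch.

```python
import sqlite3

# Example tree from the question as node -> parent (None marks a root).
PARENT = {1: None, 2: 1, 7: 1, 8: 2, 17: 2, 15: 8, 16: 8,
          3: None, 4: 3, 6: 3, 5: 4,
          9: None, 12: 9, 10: None, 14: 10, 11: None, 13: 11}

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nodes (node INTEGER PRIMARY KEY, time INTEGER, root INTEGER);
    CREATE TABLE closure (ancestor INTEGER, descendant INTEGER, depth INTEGER);
""")
for node, parent in PARENT.items():
    # Stand-in timestamp: the node id doubles as the modification time.
    conn.execute("INSERT INTO nodes VALUES (?, ?, ?)",
                 (node, node, 1 if parent is None else 0))
    ancestor, depth = node, 0
    while ancestor is not None:  # one closure row per ancestor, self included
        conn.execute("INSERT INTO closure VALUES (?, ?, ?)",
                     (ancestor, node, depth))
        ancestor, depth = PARENT[ancestor], depth + 1

# The query from this answer, unchanged apart from whitespace.
QUERY = """
    SELECT c.ancestor, MAX(n.time) AS t FROM closure c
        JOIN nodes n ON (c.descendant = n.node AND c.ancestor <> n.node)
        JOIN nodes n2 ON (c.ancestor = n2.node AND n2.root = 1)
        GROUP BY c.ancestor ORDER BY t DESC LIMIT 3
"""
result = conn.execute(QUERY).fetchall()
print(result)  # [(1, 17), (10, 14), (11, 13)]
```

Note that the `c.ancestor <> n.node` condition leaves the root's own time out of the maximum, so a root without any replies would not show up at all.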

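For me it performs quite well, and it scales too. I generated a tree of 100000 nodes and it took about 1 second to get the result (maximum tree depth was 18).

I am attaching the Perl script used to generate the content (and the table schema), so perhaps someone can tune this query to perform better.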

#!/usr/bin/perl --

use strict;
use warnings;
use Data::Random qw(:all);
my ($maxnode, $node) = ();

my $dbh = !DATABASE INIT!

foreach ( 1 .. $ARGV[0] ) {
    $node = ($_ == 1) ? 0 : int( rand(4) );

    if (!$node) {
        $maxnode = &RootNode(1);
    }
    else {
        $maxnode = &Node($maxnode);
    }
}


##
## 
sub Node {
my $parent = int( rand($_[0]) ) + 1;

my $id = &RootNode(0, $parent);

my $insert = qq|INSERT INTO test.closure (ancestor, descendant, depth) 
        SELECT ancestor, $id, depth + 1 
        FROM test.closure WHERE descendant = ?|;
$dbh->do($insert, undef, $parent);
return $id;

}
##


##
## 
sub RootNode {
my $min_datetime = $_[0] 
        ? '2008-9-21 4:0:0' 
        :  $dbh->selectrow_array( "SELECT time 
                FROM test.nodes WHERE node = ?", undef, $_[1] );
my $label = join( "", rand_chars( set => 'alpha', min => 5, max => 20 ) );
my $time = rand_datetime( min => $min_datetime, max => 'now' );

my $insert = qq|INSERT INTO test.nodes (label, time, root) VALUES (?, ?, ?)|;
$dbh->do($insert, undef, $label, $time, $_[0]);
my ($id) = $dbh->selectrow_array("SELECT LAST_INSERT_ID()");

$insert = qq|INSERT INTO test.closure (ancestor, descendant, depth) 
        VALUES (?, ?, 0)|;
$dbh->do($insert, undef, $id, $id);

return $id;
}
##

__DATA__
USE test

DROP TABLE IF EXISTS `closure`;
DROP TABLE IF EXISTS `nodes`;

CREATE TABLE `nodes` (
`node` int(11) NOT NULL auto_increment,
`label` varchar(20) NOT NULL,
`time` datetime default NULL,
`root` tinyint(1) unsigned default NULL,
PRIMARY KEY  (`node`)
) ENGINE=InnoDB;

CREATE TABLE `closure` (
`ancestor` int(11) NOT NULL,
`descendant` int(11) NOT NULL,
`depth` tinyint(3) unsigned default NULL,
PRIMARY KEY  (`ancestor`,`descendant`),
KEY `descendant` (`descendant`),
CONSTRAINT `closure_ibfk_1` FOREIGN KEY (`ancestor`) REFERENCES `nodes` (`node`),
CONSTRAINT `closure_ibfk_2` FOREIGN KEY (`descendant`) REFERENCES `nodes` (`node`)
) ENGINE=InnoDB;

Answer 1 (score: 2)

You could create a single top-level element, used only for reference, whose children are all the root nodes.

  • top
      • SUB1
      • SUB2
      • SUB3
    • root-2
    • root3
    • root4
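Answer 2 (score: 1)

I tried to model this in a database and came up with this query to find the last 3 root nodes with the latest posts. I am not sure I understood all of your requirements, so if I missed something, please let me know and I will help you as soon as I can.

My query is as follows: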


SELECT TOP 3 QRY_GROUP_ALL_OF_THEM.MínDeancestor, Max(QRY_GROUP_ALL_OF_THEM.descendant) AS MáxDedescendant
FROM (  SELECT Min(closure.ancestor) AS MínDeancestor, [QRY_LAST_INSERTIONS].[descendant]
    FROM closure,   (SELECT DISTINCT closure.descendant 
        FROM closure 
        GROUP BY closure.descendant, closure.depth, closure.ancestor, closure.descendant 
        HAVING  (((closure.descendant>12 And closure.descendant<>[closure].[ancestor]) AND (closure.depth<>0)) 
            OR ((closure.descendant<>[closure].[ancestor]) AND (closure.depth<>0)))
        ) AS QRY_LAST_INSERTIONS
    GROUP BY closure.descendant, [QRY_LAST_INSERTIONS].[descendant]
    HAVING (((closure.descendant)=[QRY_LAST_INSERTIONS].[descendant]))
) AS QRY_GROUP_ALL_OF_THEM
GROUP BY QRY_GROUP_ALL_OF_THEM.MínDeancestor
ORDER BY Max(QRY_GROUP_ALL_OF_THEM.descendant) DESC;

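As you can see, there are three queries nested in the same statement. If it works for you, let me know and I will explain tomorrow how it works.

Best regards, Jordi Mas

Answer 3 (score: 1)

Here is the same code without the accented characters in the aliases; please check it and tell me whether it works for you. I tried it under Microsoft SQL Server because I have no MySQL server on my laptop, but if it doesn't work, tell me and I will install one and try it.

Query: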

SELECT TOP 3 QRY_GROUP_ALL_OF_THEM.MinAncestor, Max(QRY_GROUP_ALL_OF_THEM.descendant) AS MaxDescendant
FROM (  
SELECT Min(closure.ancestor) AS MinAncestor, [QRY_LAST_INSERTIONS].[descendant]     
FROM closure, (
    SELECT DISTINCT closure.descendant          
    FROM closure          
    GROUP BY closure.descendant, closure.depth, closure.ancestor, closure.descendant          
    HAVING  (((closure.descendant>12 And closure.descendant<>[closure].[ancestor]) 
        AND (closure.depth<>0))              
        OR ((closure.descendant<>[closure].[ancestor]) AND (closure.depth<>0)))         
) AS QRY_LAST_INSERTIONS     
GROUP BY closure.descendant, [QRY_LAST_INSERTIONS].[descendant]     
HAVING (((closure.descendant)=[QRY_LAST_INSERTIONS].[descendant])) ) AS QRY_GROUP_ALL_OF_THEM 
GROUP BY QRY_GROUP_ALL_OF_THEM.MinAncestor 
ORDER BY Max(QRY_GROUP_ALL_OF_THEM.descendant) DESC; 

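The result of this query with your data is:

MinAncestor: 1, 10, 11

MaxDescendant: 17, 14, 13

I hope it helps.


After your comment about the TOP clause (it does not work in MySQL), the final query has to be this: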

SELECT 
    QRY_GROUP_ALL_OF_THEM.MinAncestor, 
    Max(QRY_GROUP_ALL_OF_THEM.descendant) AS MaxDescendant
FROM 
    (  
        SELECT 
            Min(closure.ancestor) AS MinAncestor, 
            QRY_LAST_INSERTIONS.descendant
        FROM closure, 
            (
                SELECT DISTINCT closure.descendant 
                FROM   closure 
                GROUP  BY closure.descendant, 
                          closure.depth, 
                          closure.ancestor, 
                          closure.descendant 
                HAVING ( ( ( closure.descendant > 12 
                             AND closure.descendant <> closure.ancestor ) 
                           AND ( closure.depth <> 0 ) ) 
                          OR ( ( closure.descendant <> closure.ancestor ) 
                               AND ( closure.depth <> 0 ) ) )
            ) AS QRY_LAST_INSERTIONS
        GROUP BY 
            closure.descendant, 
            QRY_LAST_INSERTIONS.descendant
        HAVING (((closure.descendant)=QRY_LAST_INSERTIONS.descendant)) 
    ) AS QRY_GROUP_ALL_OF_THEM 
GROUP BY QRY_GROUP_ALL_OF_THEM.MinAncestor 
ORDER BY Max(QRY_GROUP_ALL_OF_THEM.descendant) DESC
LIMIT 0,3;

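Answer 4 (score: 1)

This thread is quite old, but I stumbled on it before coming up with a solution that needs no extra columns beyond the standard ancestor & descendant. You don't even need the time, because the OP stated the problem himself: the ancestors you want are the ones that are not descendants of any other node. The final query is below, followed by test data so you can try it yourself.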

select a.node_name, a.node_id
from test.hier a left outer join 
             (select coo.descendant /* coo = CHILD OF OTHER */
              from test.closure_tree coo right outer join test.closure_tree ro
                    on coo.ancestor <> ro.descendant /* ignore its self reference */
                    and coo.descendant = ro.descendant /* belongs to another node besides itself */)lo 
on a.node_id = lo.descendant
where lo.descendant is null /* wasn't found to be a child of another node besides itself */
group by a.node_name, a.node_id
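
The same "not a descendant of any other node" test can also be written as a NOT EXISTS anti-join. The sketch below is an alternative formulation rather than the query above, run against the question's example tree in an in-memory SQLite database; the table layout and the `PARENT` map are assumptions of the sketch.

```python
import sqlite3

# Example tree from the question as node -> parent (None marks a root).
PARENT = {1: None, 2: 1, 7: 1, 8: 2, 17: 2, 15: 8, 16: 8,
          3: None, 4: 3, 6: 3, 5: 4,
          9: None, 12: 9, 10: None, 14: 10, 11: None, 13: 11}

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nodes (node INTEGER PRIMARY KEY);
    CREATE TABLE closure (ancestor INTEGER, descendant INTEGER);
""")
for node in PARENT:
    conn.execute("INSERT INTO nodes VALUES (?)", (node,))
    ancestor = node
    while ancestor is not None:  # one closure row per ancestor, self included
        conn.execute("INSERT INTO closure VALUES (?, ?)", (ancestor, node))
        ancestor = PARENT[ancestor]

# A root is a node that is a descendant of nothing except itself.
QUERY = """
    SELECT n.node
    FROM nodes n
    WHERE NOT EXISTS (
        SELECT 1 FROM closure c
        WHERE c.descendant = n.node AND c.ancestor <> n.node
    )
    ORDER BY n.node
"""
roots = [row[0] for row in conn.execute(QUERY)]
print(roots)  # [1, 3, 9, 10, 11]
```

This only identifies the roots; ranking them by their newest descendant still needs a join against the modification times.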

Test data script to load this test hierarchy:

--create table test.hier (
--  node_name varchar(10), 
--  node_id int identity (1,1) primary key
--)     

--insert into test.hier (node_name)
--values ('ROOT1')
--insert into test.hier (node_name)
--values ('ROOT2')
--insert into test.hier (node_name)
--values ('ROOT3')
--insert into test.hier (node_name)
--values ('ChildOf1')
--insert into test.hier (node_name)
--values ('ChildOf1')
--insert into test.hier (node_name)
--values ('ChildOf1')
--insert into test.hier (node_name)
--values ('ChildOf1')
--insert into test.hier (node_name)
--values ('ChildOf1')
--insert into test.hier (node_name)
--values ('ChildOf2')
--insert into test.hier (node_name)
--values ('ChildOf2')
--insert into test.hier (node_name)
--values ('ChildOf3')
--insert into test.hier (node_name)
--values ('ChildOf3')
--insert into test.hier (node_name)
--values ('ChildOf3')
--insert into test.hier (node_name)
--values ('ChildOf3')
--insert into test.hier (node_name)
--values ('LeafOf3')
--insert into test.hier (node_name)
--values ('LeafOf3')
--insert into test.hier (node_name)
--values ('LeafOf3')
--insert into test.hier (node_name)
--values ('LeafOf3')
--insert into test.hier (node_name)
--values ('LeafOf1')
--insert into test.hier (node_name)
--values ('LeafOf2')

--create table test.closure_tree (
--  ancestor int, 
--  descendant int, 
--  PRIMARY KEY (ancestor, descendant), 
--  constraint fk_test_a FOREIGN KEY (ancestor) references test.hier (node_id), 
--  constraint fk_test_d FOREIGN KEY (descendant) references test.hier (node_id)
--)

-- SELF REFERENCES 
--insert into test.closure_tree (ancestor, descendant)
--select node_id as a, node_id as d
--from test.hier

--insert into test.closure_tree (ancestor, descendant)
--select a.node_id, b.node_id 
--from test.hier a join test.hier b
--      on a.node_name = 'ROOT1' 
--      and b.node_name = 'ChildOf1'

--insert into test.closure_tree (ancestor, descendant)
--select a.node_id, b.node_id 
--from test.hier a join test.hier b
--      on a.node_name = 'ROOT2' 
--      and b.node_name = 'ChildOf2'

--insert into test.closure_tree (ancestor, descendant)
--select a.node_id, b.node_id 
--from test.hier a join test.hier b
--      on a.node_name = 'ROOT3' 
--      and b.node_name = 'ChildOf3'

--insert into test.closure_tree (ancestor, descendant)
--select a.node_id, b.node_id 
--from test.hier a join test.hier b
--      on a.node_name = 'ChildOf3' 
--      and b.node_name = 'LeafOf3'

--insert into test.closure_tree (ancestor, descendant)
--select a.node_id, b.node_id 
--from test.hier a join test.hier b
--      on a.node_name = 'ROOT3' 
--      and b.node_name = 'LeafOf3'

--insert into test.closure_tree (ancestor, descendant)
--select a.node_id, b.node_id 
--from test.hier a join test.hier b
--      on a.node_name = 'ROOT1' 
--      and b.node_name = 'LeafOf1'

--insert into test.closure_tree (ancestor, descendant)
--select a.node_id, b.node_id 
--from test.hier a join test.hier b
--      on a.node_name = 'ChildOf1' 
--      and b.node_name = 'LeafOf1'

--insert into test.closure_tree (ancestor, descendant)
--select a.node_id, b.node_id 
--from test.hier a join test.hier b
--      on a.node_name = 'ChildOf2' 
--      and b.node_name = 'LeafOf2'

--insert into test.closure_tree (ancestor, descendant)
--select a.node_id, b.node_id 
--from test.hier a join test.hier b
--      on a.node_name = 'Root2' 
--      and b.node_name = 'LeafOf2'


---- Test read of hierarchy with weird ordering for human readability
--select a.node_name, b.node_name as descendant_node_name 
--from test.hier a join test.closure_tree c
--  on a.node_id = c.ancestor
--  join test.hier b
--  on c.descendant = b.node_id
--order by right(a.node_name, 1), left(a.node_name, 1) desc

Answer 5 (score: 0)

select x.ancestor
from nodes n
join closure c on (c.descendant = n.node)
join (
-- all root nodes: a root appears as a descendant only in its own self-reference row
   select ancestor 
   from closure
   group by descendant 
   having count(*) = 1
) x ON x.ancestor = c.ancestor
where c.depth = 1
order by n.time desc
limit 3
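
Because of the `c.depth = 1` filter, the query above only looks at direct children, so a newer node deeper in a tree (such as 17, which sits at depth 2 under root 1) is not considered, and a root with several children can appear more than once. Below is a hedged variation that keeps the same root-detection subquery but aggregates over all descendants per root. It is a sketch on SQLite with stand-in timestamps, not the answerer's query.

```python
import sqlite3

# Example tree from the question as node -> parent (None marks a root).
PARENT = {1: None, 2: 1, 7: 1, 8: 2, 17: 2, 15: 8, 16: 8,
          3: None, 4: 3, 6: 3, 5: 4,
          9: None, 12: 9, 10: None, 14: 10, 11: None, 13: 11}

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nodes (node INTEGER PRIMARY KEY, time INTEGER);
    CREATE TABLE closure (ancestor INTEGER, descendant INTEGER, depth INTEGER);
""")
for node in PARENT:
    # Stand-in timestamp: the node id doubles as the modification time.
    conn.execute("INSERT INTO nodes VALUES (?, ?)", (node, node))
    ancestor, depth = node, 0
    while ancestor is not None:  # one closure row per ancestor, self included
        conn.execute("INSERT INTO closure VALUES (?, ?, ?)",
                     (ancestor, node, depth))
        ancestor, depth = PARENT[ancestor], depth + 1

# x: nodes whose only closure row as a descendant is the self-reference.
QUERY = """
    SELECT x.root, MAX(n.time) AS t
    FROM (SELECT descendant AS root
          FROM closure
          GROUP BY descendant
          HAVING COUNT(*) = 1) x
        JOIN closure c ON c.ancestor = x.root AND c.depth > 0
        JOIN nodes n ON n.node = c.descendant
    GROUP BY x.root
    ORDER BY t DESC
    LIMIT 3
"""
result = conn.execute(QUERY).fetchall()
print(result)  # [(1, 17), (10, 14), (11, 13)]
```

Grouping by the root is what makes each root count only once, which is the requirement stated in the question.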