就运行时而言,有向图的最着名的传递闭包算法是什么?
我目前正在使用Warshall的算法但它的O(n ^ 3)。虽然,由于图形表示我的实现稍微好一点(而不是检查所有边缘,它只检查所有前进边缘)。有没有比这更好的传递闭包算法?特别是,有没有专门针对共享内存多线程架构的任何内容?
答案 0 :(得分:14)
本文讨论了各种传递闭包算法的性能:
http://www.vldb.org/conf/1988/P382.PDF
本文的一个有趣想法是避免在图表发生变化时重新计算整个闭包。
Esko Nuutila也有这个页面,它列出了几个最近的算法:
http://www.cs.hut.fi/~enu/tc.html
他在该页面上列出的博士论文可能是最佳起点:
http://www.cs.hut.fi/~enu/thesis.html
从该页面开始:
实验也表明用区间表示 和新算法,可以计算传递闭包 通常在时间上与输入图的大小成线性关系。
答案 1 :(得分:11)
Algorithm Design manual有一些有用的信息。要点:
答案 2 :(得分:0)
令人惊讶的是,我找不到 STACK_TC
算法 described by Esko Nuutila 的任何实现(由 AmigoNico 在另一个答案中链接)。
所以我用 C++ 编写了自己的简单实现。与原算法略有不同,解释见代码注释。
它成功地通过了我尝试的一些测试,但鼓励读者对其进行更多测试,并根据原始论文进行验证。代码可能优化不足。
struct TransitiveClosure
{
// The nodes of the graph are grouped into 'components'.
// In a component, each node is reachable (possibly indirectly) from every other node.
// If the number of components is same as the number of nodes, your graph has no cycles.
// Otherwise there will be less components.
// The components form a graph called 'condensation graph', which is always acyclic.
// The components are numbered in a way that 'B is reachable from A' implies `B <= A`.
struct Node
{
// An arbitrarily selected node in the same component. Same for all nodes in this component.
std::size_t root = -1;
// Index if the component this node belongs to.
std::size_t comp = -1;
};
std::vector<Node> nodes; // Size is the number of nodes, which was passed in the argument.
struct Component
{
// Nodes that are a part of this component.
std::vector<std::size_t> nodes;
// Which components are reachable (possibly indirectly) from this one.
// Unordered, but has no duplicates. May or may not contain itself.
std::vector<std::size_t> next;
// A convenicene array.
// `next_contains[i]` is 1 if and only if `next` contains `i`.
// Some trailing zeroes might be missing, check the size before accessing it.
std::vector<unsigned char/*boolean*/> next_contains;
};
std::vector<Component> components; // Size is the number of components.
};
[[nodiscard]] TransitiveClosure ComputeTransitiveClosure(std::size_t n, std::function<bool(std::size_t a, std::size_t b)> have_edge_from_to)
{
// Implementation of the 'STACK_TC' algorithm, described by Esko Nuutila (1995), in
// 'Efficient Transitive Closure Computation in Large Digraphs'.
constexpr std::size_t nil = -1;
TransitiveClosure ret;
ret.nodes.resize(n);
std::vector<std::size_t> vstack, cstack; // Vertex and component stacks.
vstack.reserve(n);
cstack.reserve(n);
auto StackTc = [&](auto &StackTc, std::size_t v)
{
if (ret.nodes[v].root != nil)
return; // We already visited `v`.
ret.nodes[v].root = v;
ret.nodes[v].comp = nil;
vstack.push_back(v);
std::size_t saved_height = cstack.size();
bool self_loop = false;
for (std::size_t w = 0; w < n; w++)
{
if (!have_edge_from_to(v, w))
continue;
if (v == w)
{
self_loop = true;
}
else
{
StackTc(StackTc, w);
if (ret.nodes[w].comp == nil)
ret.nodes[v].root = std::min(ret.nodes[v].root, ret.nodes[w].root);
else
cstack.push_back(ret.nodes[w].comp);
// The paper that this is based on had an extra condition on this last `else` branch,
// which I wasn't able to understand: `if (v,w) is not a forward edge`.
// However! Ivo Gabe de Wolff (2019) in "Higher ranked region inference for compile-time garbage collection"
// says that it doesn't affect correctness:
// > In the loop over the successors, the original algorithm Stack_TC checks whether
// > an edge is a so called forward edge. We do not perform this check, which may cause
// > that a component is pushed multiple times to cstack. As duplicates are removed in the
// > topological sort, these will be removed later on and not cause problems with correctness.
}
}
if (ret.nodes[v].root == v)
{
std::size_t c = ret.components.size();
ret.components.emplace_back();
TransitiveClosure::Component &this_comp = ret.components.back();
this_comp.next_contains.assign(ret.components.size(), false); // Sic.
if (vstack.back() != v || self_loop)
{
this_comp.next.push_back(c);
this_comp.next_contains[c] = true;
}
// Topologically sort a part of the component stack.
std::sort(cstack.begin() + saved_height, cstack.end(), [&comp = ret.components](std::size_t a, std::size_t b) -> bool
{
if (b >= comp[a].next_contains.size())
return false;
return comp[a].next_contains[b];
});
// Remove duplicates.
cstack.erase(std::unique(cstack.begin() + saved_height, cstack.end()), cstack.end());
while (cstack.size() != saved_height)
{
std::size_t x = cstack.back();
cstack.pop_back();
if (!this_comp.next_contains[x])
{
if (!this_comp.next_contains[x])
{
this_comp.next.push_back(x);
this_comp.next_contains[x] = true;
}
this_comp.next.reserve(this_comp.next.size() + ret.components[x].next.size());
for (std::size_t c : ret.components[x].next)
{
if (!this_comp.next_contains[c])
{
this_comp.next.push_back(c);
this_comp.next_contains[c] = true;
}
}
}
}
std::size_t w;
do
{
w = vstack.back();
vstack.pop_back();
ret.nodes[w].comp = c;
this_comp.nodes.push_back(w);
}
while (w != v);
}
};
for (std::size_t v = 0; v < n; v++)
StackTc(StackTc, v);
return ret;
}
我的测试用例(来自同一篇论文):
输入:(邻接矩阵,Y为边源,X为边目的)
{0,1,0,0,0,0,0,0},
{0,0,1,1,0,0,0,0},
{1,0,0,1,0,0,0,0},
{0,0,0,0,1,1,0,0},
{0,0,0,0,0,0,1,0},
{0,0,0,0,1,0,0,1},
{0,0,0,0,1,0,0,0},
{0,0,0,0,0,1,0,0},
输出:
{nodes=[(0,3),(0,3),(0,3),(3,2),(4,0),(5,1),(4,0),(5,1)],components=[{nodes=[6,4],next=[0],next_contains=[1]},{nodes=[7,5],next=[1,0],next_contains=[1,1]},{nodes=[3],next=[0,1],next_contains=[1,1,0]},{nodes=[2,1,0],next=[3,2,0,1],next_contains=[1,1,1,1]}]}
输入:
// a b c d e f g h i j
/* a */{0,1,0,0,0,1,0,1,0,0},
/* b */{1,0,1,0,0,0,0,0,0,0},
/* c */{0,1,0,1,0,0,0,0,0,0},
/* d */{0,0,0,0,1,0,0,0,0,0},
/* e */{0,0,0,1,0,0,0,0,0,0},
/* f */{0,0,0,0,0,0,1,0,0,0},
/* g */{0,0,0,1,0,1,0,0,0,0},
/* h */{0,0,0,0,0,0,0,0,1,0},
/* i */{0,0,1,0,1,0,0,1,0,1},
/* j */{0,0,0,0,0,0,0,0,0,0},
输出:
{nodes=[(0,3),(0,3),(0,3),(3,0),(3,0),(5,1),(5,1),(0,3),(0,3),(9,2)],components=[{nodes=[4,3],next=[0],next_contains=[1]},{nodes=[6,5],next=[1,0],next_contains=[1,1]},{nodes=[9],next=[],next_contains=[0,0,0]},{nodes=[8,7,2,1,0],next=[3,2,0,1],next_contains=[1,1,1,1]}]}