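I am trying to implement Disjoint Sets for use in Kruskal's algorithm, but I am having trouble understanding exactly how it should be done and, in particular, how to manage the forest of trees. After reading the Wikipedia description of disjoint sets and the description in Introduction to Algorithms (Cormen et al.), I have come up with the following: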
class DisjointSet
{
public:
    class Node
    {
    public:
        int data;
        int rank;
        Node* parent;
        Node() : data(0),
                 rank(0),
                 parent(this) { } // the constructor does MakeSet
    };
    Node* find(Node*);
    Node* merge(Node*, Node*); // Union
};

DisjointSet::Node* DisjointSet::find(DisjointSet::Node* n)
{
    if (n != n->parent) {
        n->parent = find(n->parent);
    }
    return n->parent;
}

DisjointSet::Node* DisjointSet::merge(DisjointSet::Node* x,
                                      DisjointSet::Node* y)
{
    x = find(x);
    y = find(y);
    if (x->rank > y->rank) {
        y->parent = x;
    } else {
        x->parent = y;
        if (x->rank == y->rank) {
            ++(y->rank);
        }
    }
}
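I am pretty sure this is incomplete, though, and that I am missing something.
Introduction to Algorithms mentions that there should be a forest of trees, but it does not explain how to implement this forest in practice. I watched CS 61B Lecture 31: Disjoint Sets (http://www.youtube.com/watch?v=wSPAjGfDl7Q), where the lecturer uses only an array to store both the forest and all its trees and values. There is no explicit 'Node' type like the one I have. I have also found many other sources (I cannot post more than one link) that use this technique. I would be happy to do the same, except that it relies on the indices of the array for lookup, and since I want to store values of types other than int, I need to use something else (std::map comes to mind).
Another issue I am unsure about is memory allocation, because I am using C++. I am storing trees of pointers, and my MakeSet operation will be: new DisjointSet::Node;. These Nodes only have pointers to their parents, so I am not sure how to find the bottom of a tree. How can I traverse my trees to deallocate them all?
I understand the basic concept of this data structure, but I am a bit confused about the implementation. Any advice and suggestions would be most welcome, thank you.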
Answer 0 (score: 4)
Not a perfect implementation by any means (I did write it, after all!), but does this help?
/***
 * millipede: DisjointSetForest.h
 * Copyright Stuart Golodetz, 2009. All rights reserved.
 ***/

#ifndef H_MILLIPEDE_DISJOINTSETFOREST
#define H_MILLIPEDE_DISJOINTSETFOREST

#include <map>

#include <common/exceptions/Exception.h>
#include <common/io/util/OSSWrapper.h>
#include <common/util/NullType.h>

namespace mp {

/**
@brief  A disjoint set forest is a fairly standard data structure used to represent the partition of
        a set of elements into disjoint sets in such a way that common operations such as merging two
        sets together are computationally efficient.

This implementation uses the well-known union-by-rank and path compression optimizations, which together
yield an amortised complexity for key operations of O(a(n)), where a is the (extremely slow-growing)
inverse of the Ackermann function.

The implementation also allows clients to attach arbitrary data to each element, which can be useful for
some algorithms.

@tparam T   The type of data to attach to each element (arbitrary)
*/
template <typename T = NullType>
class DisjointSetForest
{
    //#################### NESTED CLASSES ####################
private:
    struct Element
    {
        T m_value;
        int m_parent;
        int m_rank;

        Element(const T& value, int parent)
        :   m_value(value), m_parent(parent), m_rank(0)
        {}
    };

    //#################### PRIVATE VARIABLES ####################
private:
    mutable std::map<int,Element> m_elements;
    int m_setCount;

    //#################### CONSTRUCTORS ####################
public:
    /**
    @brief  Constructs an empty disjoint set forest.
    */
    DisjointSetForest()
    :   m_setCount(0)
    {}

    /**
    @brief  Constructs a disjoint set forest from an initial set of elements and their associated values.

    @param[in]  initialElements     A map from the initial elements to their associated values
    */
    explicit DisjointSetForest(const std::map<int,T>& initialElements)
    :   m_setCount(0)
    {
        add_elements(initialElements);
    }

    //#################### PUBLIC METHODS ####################
public:
    /**
    @brief  Adds a single element x (and its associated value) to the disjoint set forest.

    @param[in]  x       The index of the element
    @param[in]  value   The value to initially associate with the element
    @pre
        -   x must not already be in the disjoint set forest
    */
    void add_element(int x, const T& value = T())
    {
        m_elements.insert(std::make_pair(x, Element(value, x)));
        ++m_setCount;
    }

    /**
    @brief  Adds multiple elements (and their associated values) to the disjoint set forest.

    @param[in]  elements    A map from the elements to add to their associated values
    @pre
        -   None of the elements to be added must already be in the disjoint set forest
    */
    void add_elements(const std::map<int,T>& elements)
    {
        for(typename std::map<int,T>::const_iterator it=elements.begin(), iend=elements.end(); it!=iend; ++it)
        {
            m_elements.insert(std::make_pair(it->first, Element(it->second, it->first)));
        }
        m_setCount += elements.size();
    }

    /**
    @brief  Returns the number of elements in the disjoint set forest.

    @return As described
    */
    int element_count() const
    {
        return static_cast<int>(m_elements.size());
    }

    /**
    @brief  Finds the index of the root element of the tree containing x in the disjoint set forest.

    @param[in]  x   The element whose set to determine
    @pre
        -   x must be an element in the disjoint set forest
    @throw Exception
        -   If the precondition is violated
    @return As described
    */
    int find_set(int x) const
    {
        Element& element = get_element(x);
        int& parent = element.m_parent;
        if(parent != x)
        {
            parent = find_set(parent);
        }
        return parent;
    }

    /**
    @brief  Returns the current number of disjoint sets in the forest (i.e. the current number of trees).

    @return As described
    */
    int set_count() const
    {
        return m_setCount;
    }

    /**
    @brief  Merges the disjoint sets containing elements x and y.

    If both elements are already in the same disjoint set, this is a no-op.

    @param[in]  x   The first element
    @param[in]  y   The second element
    @pre
        -   Both x and y must be elements in the disjoint set forest
    @throw Exception
        -   If the precondition is violated
    */
    void union_sets(int x, int y)
    {
        int setX = find_set(x);
        int setY = find_set(y);
        if(setX != setY) link(setX, setY);
    }

    /**
    @brief  Returns the value associated with element x.

    @param[in]  x   The element whose value to return
    @pre
        -   x must be an element in the disjoint set forest
    @throw Exception
        -   If the precondition is violated
    @return As described
    */
    T& value_of(int x)
    {
        return get_element(x).m_value;
    }

    /**
    @brief  Returns the value associated with element x.

    @param[in]  x   The element whose value to return
    @pre
        -   x must be an element in the disjoint set forest
    @throw Exception
        -   If the precondition is violated
    @return As described
    */
    const T& value_of(int x) const
    {
        return get_element(x).m_value;
    }

    //#################### PRIVATE METHODS ####################
private:
    Element& get_element(int x) const
    {
        typename std::map<int,Element>::iterator it = m_elements.find(x);
        if(it != m_elements.end()) return it->second;
        else throw Exception(OSSWrapper() << "No such element: " << x);
    }

    void link(int x, int y)
    {
        Element& elementX = get_element(x);
        Element& elementY = get_element(y);
        int& rankX = elementX.m_rank;
        int& rankY = elementY.m_rank;
        if(rankX > rankY)
        {
            elementY.m_parent = x;
        }
        else
        {
            elementX.m_parent = y;
            if(rankX == rankY) ++rankY;
        }
        --m_setCount;
    }
};

}

#endif
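Answer 2 (score: 3)
Your implementation looks good (except that the function merge should either be declared as returning void or actually return something; I prefer returning void).
The problem is that you need to keep track of the Node* pointers. You can do that either by keeping a vector<DisjointSet::Node*> in your DisjointSet class, or by keeping this vector somewhere else and declaring the methods of DisjointSet as static.
Here is an example run (note that I changed merge to return void and did not change the methods of DisjointSet to be static):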
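#include <iostream>
#include <vector>
using namespace std;

// Assumes the DisjointSet class from the question, with merge changed to return void.
int main()
{
    vector<DisjointSet::Node*> v(10);
    DisjointSet ds;

    // MakeSet for the values 0..9
    for (int i = 0; i < 10; ++i) {
        v[i] = new DisjointSet::Node();
        v[i]->data = i;
    }

    // Each input line holds two indices whose sets are merged
    int x, y;
    while (cin >> x >> y) {
        ds.merge(v[x], v[y]);
    }

    // Print each value followed by the value stored in its current parent
    for (int i = 0; i < 10; ++i) {
        cout << v[i]->data << ' ' << v[i]->parent->data << endl;
    }

    return 0;
}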
With this input:
3 1
1 2
2 4
0 7
8 9
it prints the expected output:
0 7
1 1
2 1
3 1
4 1
5 5
6 6
7 7
8 9
9 9
Your forest is made up of these trees:
   7    1       5    6   9
 /    / | \              |
0    2  3  4             8
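So your algorithm is good and, as far as I know, has the best complexity for union-find, and you keep track of your Nodes in your vector. So you can simply deallocate them: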
for (int i = 0; i < int(v.size()); ++i) {
    delete v[i];
}
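Answer 3 (score: 2)
I can't speak about the algorithm, but for memory management you would typically use something called a smart pointer, which frees what it points to. There are shared-ownership and single-ownership smart pointers, as well as non-owning ones. Used correctly, they go a long way toward avoiding memory problems.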
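As an illustration of the single-ownership case applied to the question's node layout, here is a minimal sketch. It assumes C++14 (for `std::make_unique`); the `NodePool` class and the free functions `find` and `merge` are hypothetical names, not part of the question's code. The pool owns every node through `std::unique_ptr`, while the `parent` links stay plain non-owning pointers, so the trees never need to be traversed for cleanup.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

struct Node {
    int data;
    int rank;
    Node* parent;  // non-owning link; a root points to itself

    explicit Node(int d) : data(d), rank(0), parent(this) {}  // MakeSet
    Node(const Node&) = delete;             // copying would break the self-pointer
    Node& operator=(const Node&) = delete;
};

// Hypothetical owner of all nodes: destroying the pool frees every node.
class NodePool {
public:
    Node* make_set(int data) {
        nodes_.push_back(std::make_unique<Node>(data));
        return nodes_.back().get();  // stable address: the Node lives on the heap
    }
    std::size_t size() const { return nodes_.size(); }

private:
    std::vector<std::unique_ptr<Node>> nodes_;
};

// Find with path compression.
Node* find(Node* n) {
    if (n != n->parent) n->parent = find(n->parent);
    return n->parent;
}

// Union by rank; does nothing if x and y are already in the same set.
void merge(Node* x, Node* y) {
    x = find(x);
    y = find(y);
    if (x == y) return;
    if (x->rank > y->rank) {
        y->parent = x;
    } else {
        x->parent = y;
        if (x->rank == y->rank) ++(y->rank);
    }
}
```

The design choice here is that ownership and tree structure are kept separate: the vector answers "who frees this node", and the parent pointers answer "which set is this node in".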
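Answer 4 (score: 0)
Your implementation is fine. All you need to do now is keep an array of disjoint-set Nodes so that you can call your union/find methods on them.
For Kruskal's algorithm, you need an array containing one disjoint-set Node per graph vertex. Then, when you look at the next edge to add to your subgraph, you use the find method to check whether its two endpoints are already in the same set (that is, already connected within your subgraph). If they are, you can move on to the next edge. Otherwise, it is time to add that edge to your subgraph and perform a union operation on the two vertices connected by that edge.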
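The steps described above can be sketched as follows. This is a minimal illustration, not the asker's code: it assumes vertices are numbered 0 to vertexCount - 1, uses an index-based forest instead of Node objects, and uses path halving (a simple iterative variant of path compression) in `find`. The `Edge` struct and the `kruskal` function are hypothetical names.

```cpp
#include <algorithm>
#include <numeric>
#include <utility>
#include <vector>

struct Edge {
    int u;
    int v;
    int weight;
};

// Returns the edges of a minimum spanning forest of the given graph.
std::vector<Edge> kruskal(int vertexCount, std::vector<Edge> edges) {
    // One disjoint-set entry per vertex; initially every vertex is its own root.
    std::vector<int> parent(vertexCount);
    std::vector<int> rank(vertexCount, 0);
    std::iota(parent.begin(), parent.end(), 0);

    // Find with path halving: each visited node is re-pointed to its grandparent.
    auto find = [&parent](int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];
            x = parent[x];
        }
        return x;
    };

    // Consider edges from lightest to heaviest.
    std::sort(edges.begin(), edges.end(),
              [](const Edge& a, const Edge& b) { return a.weight < b.weight; });

    std::vector<Edge> tree;
    for (const Edge& e : edges) {
        int ru = find(e.u);
        int rv = find(e.v);
        if (ru == rv) continue;  // endpoints already connected: edge would form a cycle

        // Union by rank, then keep the edge.
        if (rank[ru] < rank[rv]) std::swap(ru, rv);
        parent[rv] = ru;
        if (rank[ru] == rank[rv]) ++rank[ru];
        tree.push_back(e);
    }
    return tree;
}
```

For a connected graph the result has vertexCount - 1 edges; for a disconnected one it contains a spanning tree for each component.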
Answer 5 (score: 0)
This blog article shows a C++ implementation using path compression: http://toughprogramming.blogspot.com/2013/04/implementing-disjoint-sets-in-c.html
Answer 6 (score: 0)
Have a look at this code:
#include <iostream>
#include <unordered_map>
using namespace std;

class Node {
    int id,rank,data;
    Node *parent;
public :
    Node(int id,int data) {
        this->id = id;
        this->data = data;
        this->rank =0;
        this->parent = this;
    }
    friend class DisjointSet;
};

class DisjointSet {
    unordered_map<int,Node*> forest;

    // Find with path compression
    Node *find_set_helper(Node *aNode) {
        if( aNode->parent != aNode)
            aNode->parent = find_set_helper(aNode->parent);
        return aNode->parent;
    }

    // Union by rank; both arguments must be distinct roots
    void link(Node *xNode,Node *yNode) {
        if( xNode->rank > yNode->rank)
            yNode->parent = xNode;
        else if(xNode-> rank < yNode->rank)
            xNode->parent = yNode;
        else {
            xNode->parent = yNode;
            yNode->rank++;
        }
    }

public:
    DisjointSet() {
    }

    void make_set(int id,int data) {
        Node *aNode = new Node(id,data);
        this->forest.insert(make_pair(id,aNode));
    }

    void Union(int xId, int yId) {
        Node *xNode = find_set(xId);
        Node *yNode = find_set(yId);
        // Skip unknown ids and ids that are already in the same set
        if(xNode && yNode && xNode != yNode)
            link(xNode,yNode);
    }

    Node* find_set(int id) {
        unordered_map<int,Node*> :: iterator itr = this->forest.find(id);
        if(itr == this->forest.end())
            return NULL;
        return this->find_set_helper(itr->second);
    }

    void print() {
        unordered_map<int,Node*>::iterator itr;
        for(itr = forest.begin(); itr != forest.end(); itr++) {
            cout<<"\nid : "<<itr->second->id<<" parent :"<<itr->second->parent->id;
        }
    }

    ~DisjointSet(){
        unordered_map<int,Node*>::iterator itr;
        for(itr = forest.begin(); itr != forest.end(); itr++) {
            delete (itr->second);
        }
    }
};
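Answer 7 (score: 0)
For implementing Disjoint Sets from scratch, I highly recommend reading the book Data Structures & Algorithm Analysis in C++ by Mark A. Weiss.
Chapter 8 starts from the basic find / union, then gradually adds union by height / depth / rank and path compression in find. At the end, it provides a Big-O analysis.
Trust me, it has everything you want to know about Disjoint Sets and their C++ implementation.
Answer 8 (score: 0)
The following code seems easy to understand as an implementation of union-find disjoint sets with path compression: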
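Answer 9 (score: 0)
If you are asking which style is better for implementing disjoint sets (a vector or a map, i.e. a red-black tree), then I may have something to add.
make_set(int key, node info): this is generally a member function that (1) adds a node and (2) makes the node point to itself (parent = key), which initially creates a disjoint set. The time complexity of adding n nodes is O(n) for a vector and O(n * log n) for a map.
find_set(int key): this generally does two things: (1) it finds the node for the given key, and (2) it performs path compression. I couldn't really work out the cost of path compression, but simply looking up the node takes O(1) with a vector and O(log(n)) with a map.
To conclude: although the vector implementation looks better from this comparison, the time complexity of both is O(M * α(n)) ≈ O(M * 5), or so I have read.
PS. Do verify what I have written, though I am fairly sure it is correct.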