我刚刚对RedBlack Tree做了一些研究。我知道.Net 4.0中的SortedSet类使用RedBlack树。因此,我使用Reflector将该部分取出,并创建了一个RedBlackTree类。现在我在这个RedBlackTree和SortedSet上运行一些perf测试,插入40000个连续积分值(从0到39999开始),我惊讶地发现有很大的性能差异如下:
RBTree took 9.27208 sec to insert 40000 values
SortedSet took 0.0253097 sec to insert 40000 values
背后的原因是什么?顺便说一下,我只在Release配置中运行测试,这里是小测试代码
var stopWatch = new Stopwatch();
var rbT = new RedBlackTree<int>();
stopWatch = new Stopwatch();
stopWatch.Start();
for (int i = 0; i < 40000; i++) {
rbT.Add(i);
}
stopWatch.Stop();
Console.WriteLine(stopWatch.Elapsed);
var ss = new SortedSet<int>();
stopWatch = new Stopwatch();
stopWatch.Start();
for (int i = 0; i < 40000; i++) {
ss.Add(i);
}
stopWatch.Stop();
Console.WriteLine(stopWatch.Elapsed);
修改
我想为RBTree分享我提取的代码,以便您也可以运行诊断
public class Node<T>
{
public Node(){}
public Node(T value)
{
Item = value;
}
public Node(T value, bool isRed)
{
Item = value;
IsRed = isRed;
}
public T Item;
public Node<T> Left;
public Node<T> Right;
public Node<T> Parent;
public bool IsRed;
}
public class RedBlackTree<T>
{
public RedBlackTree(){}
public Node<T> root;
int count, version;
Comparer<T> comparer = Comparer<T>.Default;
public void Add(T item)
{
if (this.root == null)
{
this.root = new Node<T>(item, false);
this.count = 1;
this.version++;
return;
}
Node<T> root = this.root;
Node<T> node = null;
Node<T> grandParent = null;
Node<T> greatGrandParent = null;
this.version++;
int num = 0;
while (root != null)
{
num = this.comparer.Compare(item, root.Item);
if (num == 0)
{
this.root.IsRed = false;
return;
}
if (Is4Node(root))
{
Split4Node(root);
if (IsRed(node))
{
this.InsertionBalance(root, ref node, grandParent, greatGrandParent);
}
}
greatGrandParent = grandParent;
grandParent = node;
node = root;
root = (num < 0) ? root.Left : root.Right;
}
Node<T> current = new Node<T>(item);
if (num > 0)
{
node.Right = current;
}
else
{
node.Left = current;
}
if (node.IsRed)
{
this.InsertionBalance(current, ref node, grandParent, greatGrandParent);
}
this.root.IsRed = false;
this.count++;
}
private static bool IsRed(Node<T> node)
{
return ((node != null) && node.IsRed);
}
private static bool Is4Node(Node<T> node)
{
return (IsRed(node.Left) && IsRed(node.Right));
}
private static void Split4Node(Node<T> node)
{
node.IsRed = true;
node.Left.IsRed = false;
node.Right.IsRed = false;
}
private void InsertionBalance(Node<T> current, ref Node<T> parent, Node<T> grandParent, Node<T> greatGrandParent)
{
Node<T> node;
bool flag = grandParent.Right == parent;
bool flag2 = parent.Right == current;
if (flag == flag2)
{
node = flag2 ? RotateLeft(grandParent) : RotateRight(grandParent);
}
else
{
node = flag2 ? RotateLeftRight(grandParent) : RotateRightLeft(grandParent);
parent = greatGrandParent;
}
grandParent.IsRed = true;
node.IsRed = false;
ReplaceChildOfNodeOrRoot(greatGrandParent, grandParent, node);
}
private static Node<T> RotateLeft(Node<T> node)
{
Node<T> right = node.Right;
node.Right = right.Left;
right.Left = node;
return right;
}
private static Node<T> RotateRight(Node<T> node)
{
Node<T> left = node.Left;
node.Left = left.Right;
left.Right = node;
return left;
}
private static Node<T> RotateLeftRight(Node<T> node)
{
Node<T> left = node.Left;
Node<T> right = left.Right;
node.Left = right.Right;
right.Right = node;
left.Right = right.Left;
right.Left = left;
return right;
}
private static Node<T> RotateRightLeft(Node<T> node)
{
Node<T> right = node.Right;
Node<T> left = right.Left;
node.Right = left.Left;
left.Left = node;
right.Left = left.Right;
left.Right = right;
return left;
}
private void ReplaceChildOfNodeOrRoot(Node<T> parent, Node<T> child, Node<T> newChild)
{
if (parent != null)
{
if (parent.Left == child)
{
parent.Left = newChild;
}
else
{
parent.Right = newChild;
}
}
else
{
this.root = newChild;
}
}
}
修改
我在其他一些数据结构上运行相同的诊断(一些由我创建*,一些来自.net framework **),这里是有趣的结果
*AATree 00:00:00.0309294
*AVLTree 00:00:00.0129743
**SortedDictionary 00:00:00.0313571
*RBTree 00:00:09.2414156
**SortedSet 00:00:00.0241973
RBTree与上面相同(从SortedSet类中删除)。 我尝试了40万个值,但RBTree似乎正在取消,我真的不知道为什么。
答案 0 :(得分:17)
您的Node<T>
课程中存在错误。当您调用只接受单个值参数的构造函数时,您应该将IsRed
设置为true
。
我认为固定的Node<T>
类看起来像这样:
public sealed class Node<T>
{
public T Item { get; private set; }
public bool IsRed { get; set; }
public Node<T> Left { get; set; }
public Node<T> Right { get; set; }
public Node(T value)
{
Item = value;
IsRed = true;
}
public Node(T value, bool isRed)
{
Item = value;
IsRed = isRed;
}
}
另一个选项 - 我的偏好 - 将完全省略该构造函数,并且在实例化新节点时始终要求IsRed
显式设置:
public sealed class Node<T>
{
public T Item { get; private set; }
public bool IsRed { get; set; }
public Node<T> Left { get; set; }
public Node<T> Right { get; set; }
public Node(T value, bool isRed)
{
Item = value;
IsRed = isRed;
}
}
然后在Add
方法中替换此行...
Node<T> current = new Node<T>(item);
......有了......
Node<T> current = new Node<T>(item, true);
答案 1 :(得分:3)
答案 2 :(得分:1)
SortedSet包含TargetedPatchingOptOut属性,您复制的版本是否包含该属性?
[TargetedPatchingOptOut("Performance critical to inline this type of method across NGen image boundaries")]
public bool Add(T item)
{
return this.AddIfNotPresent(item);
}
答案 3 :(得分:0)
如果差异不是那么大,我会建议原因是.NET程序集是NGen-ed,因此它们已经被翻译成本机代码。对于您的类,将IL代码编译为本机代码的时间将在您的测试时间内分摊。如何增加循环迭代次数会影响时间?