Question

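I find myself frequently writing recursive IEnumerable<T> iterators to implement the same "Descendants" pattern as provided by, for example, XContainer.Descendants. The pattern I keep implementing is as follows, given a type Foo with a single-level iterator called Children: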

public static IEnumerable<Foo> Descendants(this Foo root) {
    foreach (var child in root.Children()) {
        yield return child;
        foreach (var subchild in child.Descendants()) {
            yield return subchild;
        }
    }
}

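This old StackOverflow question shows the same pattern. But for some reason it feels odd to me that I have to reference three levels of the hierarchy (root, child, and subchild). Can this basic depth-first recursion pattern be reduced any further? Or is it an algorithmic primitive of sorts?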

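The best I can come up with is to abstract the pattern into a generic extension. This does not reduce the logic of the iterator pattern shown above, but it does remove the need to define a Descendants method for each specific class. On the downside, it adds an extension method to Object itself, which is a bit smelly: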

public static IEnumerable<T> SelectRecurse<T>(
    this T root, Func<T, IEnumerable<T>> enumerator) {

    foreach (T item in enumerator(root))
    {
        yield return item;
        foreach (T subitem in item.SelectRecurse(enumerator))
        {
            yield return subitem;
        }
    }
}

// Now we can just write:
foreach(var item in foo.SelectRecurse(f => f.Children())) { /* do stuff */ }

Answer 1

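You can use an explicit stack, rather than implicitly using the thread's call stack, to store the data you are working with. This can even be generalized to a Traverse method that simply accepts a delegate representing the "get my children" call: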

public static IEnumerable<T> Traverse<T>(
    this IEnumerable<T> source
    , Func<T, IEnumerable<T>> childrenSelector)
{
    var stack = new Stack<T>(source);
    while (stack.Any())
    {
        var next = stack.Pop();
        yield return next;
        foreach (var child in childrenSelector(next))
            stack.Push(child);
    }
}

Because this is not recursive, and so is not constantly creating new iterator state machines, it will perform better.

Side note: if you want a breadth-first search, use a Queue instead of a Stack. If you want a best-first search, use a priority queue.
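The queue-based variant mentioned above can be sketched as follows. This is only an illustration: the name TraverseBreadthFirst and the class BreadthFirstSketch are made up for this example and do not come from the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BreadthFirstSketch
{
    // Hypothetical name. Same shape as Traverse, but a FIFO queue means
    // nodes are visited level by level instead of depth-first.
    public static IEnumerable<T> TraverseBreadthFirst<T>(
        this IEnumerable<T> source,
        Func<T, IEnumerable<T>> childrenSelector)
    {
        var queue = new Queue<T>(source);
        while (queue.Count > 0)
        {
            var next = queue.Dequeue();
            yield return next;
            // Children wait behind everything already queued.
            foreach (var child in childrenSelector(next))
                queue.Enqueue(child);
        }
    }
}
```

For a tree where a has children b and c, and b has child d, starting from a this yields a, b, c, d.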

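To ensure that siblings are returned in the same order in which the selector returns them, rather than in reverse, just add a call to Reverse on the result of childrenSelector.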
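A minimal sketch of that adjustment, assuming the roots in source should keep their order as well (the name TraversePreservingOrder is hypothetical):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderedTraverseSketch
{
    // Hypothetical name. Depth-first, pre-order, with siblings in selector order.
    public static IEnumerable<T> TraversePreservingOrder<T>(
        this IEnumerable<T> source,
        Func<T, IEnumerable<T>> childrenSelector)
    {
        // Stack<T>(IEnumerable<T>) pushes items in sequence order, so the last
        // item ends up on top. Reversing first puts the first root on top.
        var stack = new Stack<T>(source.Reverse());
        while (stack.Count > 0)
        {
            var next = stack.Pop();
            yield return next;
            // Push children in reverse so the first child is popped first.
            foreach (var child in childrenSelector(next).Reverse())
                stack.Push(child);
        }
    }
}
```

For a tree where a has children b and c, and b has children d and e, starting from a this yields a, b, d, e, c. The cost is that Reverse buffers each child sequence before anything is pushed.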

Answer 2

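I think this is a good question. The best explanation I have for why you need two loops: we need to recognize that each item is turned into multiple items (itself and all of its descendants). This means we are not mapping one-to-one (like Select) but one-to-many (like SelectMany).

We could write it like this: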

public static IEnumerable<Foo> Descendants(this IEnumerable<Foo> items) {
 foreach (var item in items) {
  yield return item;
  foreach (var subitem in item.Children().Descendants())
   yield return subitem;
 }
}

Or like this:

public static IEnumerable<Foo> Descendants(this Foo root) {
 var children = root.Children();
 var subchildren = children.SelectMany(c => c.Descendants());
 return children.Concat(subchildren);
}

Or like this:

public static IEnumerable<Foo> Descendants(this IEnumerable<Foo> items) {
 var children = items.SelectMany(c => c.Children().Descendants());
 return items.Concat(children);
}

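The versions that take an IEnumerable<Foo> must be invoked on root.Children().

I think all of these rewrites expose a different way of looking at the problem. On the other hand, they all have two nested loops. The loops can be hidden inside helper functions, but they are still there.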

Answer 3

I would manage this with a List:

public static IEnumerable<Foo> Descendants(this Foo root) {
    List<Foo> todo = new List<Foo>();
    todo.AddRange(root.Children());
    while(todo.Count > 0)
    {
        var first = todo[0];
        todo.RemoveAt(0);
        todo.InsertRange(0,first.Children());
        yield return first;
    }
}

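It is not recursive, so it should not blow the stack. You just keep adding more work for yourself to the front of the list, which gives you a depth-first traversal.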

Answer 4

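Damien_the_Unbeliever and Servy have both presented versions of the algorithm that avoid building a recursive call stack by using a collection of one type or another. Damien's use of a List may perform poorly because of the inserts at the head of the list, while Servy's use of a stack causes nested elements to be returned in reverse order. I believe a manually implemented singly linked list will keep Servy's performance while still returning all of the items in their original order. The only tricky part is initializing the first ForwardLink by iterating the root. To keep Traverse clean, I moved that into a constructor on ForwardLink.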

public static IEnumerable<T> Traverse<T>(
    this T root, 
    Func<T, IEnumerable<T>> childrenSelector) {

    var head = new ForwardLink<T>(childrenSelector(root));

    if (head.Value == null) yield break; // No items from root iterator

    while (head != null)
    {
        var headValue = head.Value;
        var localTail = head;
        var second = head.Next;

        // Insert new elements immediately behind head.
        foreach (var child in childrenSelector(headValue))
            localTail = localTail.Append(child);

        // Splice on the old tail, if there was one
        if (second != null) localTail.Next = second;

        // Pop the head
        yield return headValue;
        head = head.Next; 
    }
}

public class ForwardLink<T> {
    public T Value { get; private set; }
    public ForwardLink<T> Next { get; set; }

    public ForwardLink(T value) { Value = value; }

    public ForwardLink(IEnumerable<T> values) { 
        bool firstElement = true;
        ForwardLink<T> tail = null;
        foreach (T item in values)
        {
            if (firstElement)
            {
                Value = item;
                firstElement = false;
                tail = this;
            }
            else
            {
                tail = tail.Append(item);
            }
        }
    }

    public ForwardLink<T> Append(T value) {
        return Next = new ForwardLink<T>(value);
    } 
}

Answer 5

I came up with a different version that does not use yield:

    public abstract class RecursiveEnumerator : IEnumerator {
        public RecursiveEnumerator(ICollection collection) {
            this.collection = collection;
            this.enumerator = collection.GetEnumerator();
        }

        protected abstract ICollection GetChildCollection(object item);

        public bool MoveNext() {
            if (enumerator.Current != null) {
                ICollection child_collection = GetChildCollection(enumerator.Current);
                if (child_collection != null && child_collection.Count > 0) {
                    stack.Push(enumerator);
                    enumerator = child_collection.GetEnumerator();
                }
            }
            while (!enumerator.MoveNext()) {
                if (stack.Count == 0) return false;
                enumerator = stack.Pop();
            }
            return true;
        }

        public virtual void Dispose() { }

        public object Current { get { return enumerator.Current; } }

        public void Reset() {
            stack.Clear();
            enumerator = collection.GetEnumerator();
        }

        private IEnumerator enumerator;
        private Stack<IEnumerator> stack = new Stack<IEnumerator>();
        private ICollection collection;
    }

Usage example:

    public class RecursiveControlEnumerator : RecursiveEnumerator, IEnumerator {
        public RecursiveControlEnumerator(Control.ControlCollection controlCollection)
            : base(controlCollection) { }

        protected override ICollection GetChildCollection(object c) {
            return (c as Control).Controls;
        }
    }

Answer 6

To expand on my comment, this should work:

public static IEnumerable<Foo> Descendants(this Foo node)
{
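    yield return node; // return this node itself (branch or leaf)
    foreach (var child in node.Children())
        foreach (var c2 in child.Descendants())
            yield return c2; // pass along everything from the child's subtree
}

This returns every node, branch and leaf alike, starting with node itself. If you only want the leaf nodes, make the first yield return conditional on the node having no children; simply removing it would leave the method with nothing to yield.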
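A leaf-only variant has to yield a node exactly when it has no children. The sketch below is one way to do that, not the answer's own code: the Foo class is a hypothetical stand-in for the question's type, and Leaves is an illustrative name.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for the question's Foo type.
public class Foo
{
    private readonly List<Foo> children = new List<Foo>();
    public string Name { get; }

    public Foo(string name, params Foo[] kids)
    {
        Name = name;
        children.AddRange(kids);
    }

    public IEnumerable<Foo> Children() { return children; }
}

public static class LeafSketch
{
    // Yields only nodes without children, in depth-first order.
    public static IEnumerable<Foo> Leaves(this Foo node)
    {
        bool hasChildren = false;
        foreach (var child in node.Children())
        {
            hasChildren = true;
            foreach (var leaf in child.Leaves())
                yield return leaf;
        }
        // A node with no children is itself a leaf.
        if (!hasChildren) yield return node;
    }
}
```

For a root with children a and c, where a has a single child b, root.Leaves() yields b and then c. Tracking hasChildren inside the loop avoids enumerating Children() a second time just to test for emptiness.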

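In answer to your question: yes, it is an algorithmic primitive, because you definitely need to call node.Children(), and you definitely need to call child.Descendants() on each child. I agree that having two "foreach" loops seems odd, but the second one is really just continuing the overall enumeration rather than iterating over the children.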

Answer 7

Try this:

private static IEnumerable<T> Descendants<T>(
    this IEnumerable<T> children, Func<T, IEnumerable<T>> enumerator)
{
    Func<T, IEnumerable<T>> getDescendants =
        child => enumerator(child).Descendants(enumerator);

    Func<T, IEnumerable<T>> getChildWithDescendants =
        child => new[] { child }.Concat(getDescendants(child));

    return children.SelectMany(getChildWithDescendants);
}

Or, if you prefer a non-Linq variant:

private static IEnumerable<T> Descendants<T>(
    this IEnumerable<T> children, Func<T, IEnumerable<T>> enumerator)
{
    foreach (var child in children)
    {
        yield return child;

        var descendants = enumerator(child).Descendants(enumerator);

        foreach (var descendant in descendants)
        {
            yield return descendant;
        }
    }
}

And call it like this:

root.Children().Descendants(f => f.Children())

Logical reduction of recursive enumerations
