N-way intersection of sorted enumerables

Asked: 2009-12-14 06:19:09

Tags: c# linq .net-3.5

Given n enumerables of the same type, each returning distinct elements in ascending order, for example:

IEnumerable<char> s1 = "adhjlstxyz";
IEnumerable<char> s2 = "bdeijmnpsz";
IEnumerable<char> s3 = "dejlnopsvw";

I want to efficiently find all values that are elements of every enumerable:

IEnumerable<char> sx = Intersect(new[] { s1, s2, s3 });

Debug.Assert(sx.SequenceEqual("djs"));

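"Efficient" here means that

  1. each input enumerable should be enumerated only once,
  2. elements of the input enumerables should be retrieved only when needed, and
  3. the algorithm should not recursively enumerate its own output.

I need some hints on how to approach a solution.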


Here is my (naive) attempt so far:

    static IEnumerable<T> Intersect<T>(IEnumerable<T>[] enums)
    {
        return enums[0].Intersect(
            enums.Length == 2 ? enums[1] : Intersect(enums.Skip(1).ToArray()));
    }
    

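Enumerable.Intersect collects one of its inputs into a hash set (in Microsoft's implementation, the second one), then enumerates the other and yields every matching element. My Intersect then recursively intersects that result with the next enumerable. This is obviously not very efficient (it does not meet the constraints above), and it makes no use of the fact that the elements are sorted.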
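
To make the buffering visible, here is a minimal sketch that counts how many elements are pulled from the second input before the first result appears. `Counted` and `PulledFromSecondAfterFirstResult` are hypothetical helpers introduced only for this illustration; the observation assumes Microsoft's implementation of `Enumerable.Intersect`, which fills a set from the second sequence before streaming the first.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class EagerIntersectDemo
{
    // Hypothetical helper: passes elements through and counts how many were pulled.
    public static IEnumerable<T> Counted<T>(IEnumerable<T> source, int[] counter)
    {
        foreach (var item in source)
        {
            counter[0]++;
            yield return item;
        }
    }

    // Requests only the first common element, then reports how many elements
    // of the second sequence had been consumed by that point.
    public static int PulledFromSecondAfterFirstResult()
    {
        var pulled = new int[1];
        IEnumerable<char> first = "adhjlstxyz";
        IEnumerable<char> second = Counted("bdeijmnpsz", pulled);

        first.Intersect(second).First(); // yields 'd'

        return pulled[0];
    }

    static void Main()
    {
        // Assuming the second sequence is buffered up front, this prints 10:
        // all ten elements were read before the first match was produced.
        Console.WriteLine(PulledFromSecondAfterFirstResult());
    }
}
```

Because the whole second sequence is read before anything is yielded, constraint 2 (retrieve elements only when needed) cannot hold for this approach.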


Here is my attempt at intersecting two enumerables. Maybe it can be generalized to n enumerables?

    static IEnumerable<T> Intersect<T>(IEnumerable<T> first, IEnumerable<T> second)
    {
        using (var left = first.GetEnumerator())
        using (var right = second.GetEnumerator())
        {
            var leftHasNext = left.MoveNext();
            var rightHasNext = right.MoveNext();
    
            var comparer = Comparer<T>.Default;
    
            while (leftHasNext && rightHasNext)
            {
                switch (Math.Sign(comparer.Compare(left.Current, right.Current)))
                {
                case -1:
                    leftHasNext = left.MoveNext();
                    break;
                case 0:
                    yield return left.Current;
                    leftHasNext = left.MoveNext();
                    rightHasNext = right.MoveNext();
                    break;
                case 1:
                    rightHasNext = right.MoveNext();
                    break;
                }
            }
        }
    }
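
One possible generalization, offered only as a sketch, is to fold the two-way merge over all inputs with `Enumerable.Aggregate`. The names `IntersectSorted` and `IntersectAll` are illustrative and not part of the question. The sketch assumes every input is ascending and duplicate-free, that at least one input is supplied, and that `Comparer<T>.Default` is the intended ordering.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SortedIntersectFold
{
    // Merge-style intersection of two ascending, duplicate-free sequences.
    public static IEnumerable<T> IntersectSorted<T>(IEnumerable<T> first, IEnumerable<T> second)
    {
        var comparer = Comparer<T>.Default;
        using (var left = first.GetEnumerator())
        using (var right = second.GetEnumerator())
        {
            if (!left.MoveNext() || !right.MoveNext()) yield break;
            while (true)
            {
                int c = comparer.Compare(left.Current, right.Current);
                if (c == 0)
                {
                    // Present in both: emit it, then advance both sides.
                    yield return left.Current;
                    if (!left.MoveNext() || !right.MoveNext()) yield break;
                }
                else if (c < 0)
                {
                    if (!left.MoveNext()) yield break;  // left is behind
                }
                else
                {
                    if (!right.MoveNext()) yield break; // right is behind
                }
            }
        }
    }

    // Chains n-1 lazy two-way merges; throws InvalidOperationException on an empty input list.
    public static IEnumerable<T> IntersectAll<T>(IEnumerable<IEnumerable<T>> enums)
    {
        return enums.Aggregate((acc, next) => IntersectSorted(acc, next));
    }

    static void Main()
    {
        var inputs = new IEnumerable<char>[] { "adhjlstxyz", "bdeijmnpsz", "dejlnopsvw" };
        Console.WriteLine(new string(IntersectAll(inputs).ToArray())); // prints "djs"
    }
}
```

Each input is enumerated once and only as far as needed, but every element passes through a chain of nested iterators, so whether this satisfies constraint 3 depends on how strictly it is read. A single loop that advances all n enumerators together avoids that nesting.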
    

2 answers:

Answer 0 (score: 4)

OK; a more sophisticated answer:

public static IEnumerable<T> Intersect<T>(params IEnumerable<T>[] enums) {
    return Intersect<T>(null, enums);
}
public static IEnumerable<T> Intersect<T>(IComparer<T> comparer, params IEnumerable<T>[] enums) {
    if(enums == null) throw new ArgumentNullException("enums");
    if(enums.Length == 0) return Enumerable.Empty<T>();
    if(enums.Length == 1) return enums[0];
    if(comparer == null) comparer = Comparer<T>.Default;
    return IntersectImpl(comparer, enums);
}
public static IEnumerable<T> IntersectImpl<T>(IComparer<T> comparer, IEnumerable<T>[] enums) {
    IEnumerator<T>[] iters = new IEnumerator<T>[enums.Length];
    try {
        // create iterators and move as far as the first item
        for (int i = 0; i < enums.Length; i++) {
            if(!(iters[i] = enums[i].GetEnumerator()).MoveNext()) {
                yield break; // no data for one of the iterators
            }
        }
        bool first = true;
        T lastValue = default(T);
        do { // get the next item from the first sequence
            T value = iters[0].Current;
            if (!first && comparer.Compare(value, lastValue) == 0) continue; // dup in first source
            bool allTrue = true;
            for (int i = 1; i < iters.Length; i++) {
                var iter = iters[i];
                // if any sequence isn't there yet, progress it; if any sequence
                // ends, we're all done
                while (comparer.Compare(iter.Current, value) < 0) {
                    if (!iter.MoveNext()) goto alldone; // nasty, but
                }
                // if any sequence is now **past** value, then short-circuit
                if (comparer.Compare(iter.Current, value) > 0) {
                    allTrue = false;
                    break;
                }
            }
            // so all sequences have this value
            if (allTrue) yield return value;
            first = false;
            lastValue = value;
        } while (iters[0].MoveNext());
    alldone:
        ;
    } finally { // clean up all iterators
        for (int i = 0; i < iters.Length; i++) {
            if (iters[i] != null) {
                try { iters[i].Dispose(); }
                catch { }
            }
        }
    }
}

Answer 1 (score: 2)

You can use LINQ:

    public static IEnumerable<T> Intersect<T>(IEnumerable<IEnumerable<T>> enums) {
        using (var iter = enums.GetEnumerator()) {
            IEnumerable<T> result;
            if (iter.MoveNext()) {
                result = iter.Current;
                while (iter.MoveNext()) {
                    result = result.Intersect(iter.Current);
                }
            } else {
                result = Enumerable.Empty<T>();
            }
            return result;
        }
    }

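This is simple, although it does build a hash set multiple times. Advancing all n sequences at once (to take advantage of the sorting) would be harder, but you could also build a single hash set and remove whatever is missing from each later sequence?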
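
The single-hash-set idea could look roughly like the sketch below, which uses `HashSet<T>.IntersectWith` to drop elements missing from each later sequence. The name `IntersectViaSet` is made up for this example, and the final `OrderBy` is an added assumption: a `HashSet<T>` does not guarantee enumeration order, so the result is sorted to restore ascending output.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class HashSetIntersect
{
    public static IEnumerable<T> IntersectViaSet<T>(IEnumerable<IEnumerable<T>> enums)
    {
        HashSet<T> common = null;
        foreach (var sequence in enums)
        {
            if (common == null)
                common = new HashSet<T>(sequence);   // seed with the first sequence
            else
                common.IntersectWith(sequence);      // keep only elements also present here

            if (common.Count == 0) break;            // nothing can survive further intersections
        }

        if (common == null) return Enumerable.Empty<T>();

        // Sort with the default comparer, since set order is unspecified.
        return common.OrderBy(x => x);
    }

    static void Main()
    {
        var inputs = new IEnumerable<char>[] { "adhjlstxyz", "bdeijmnpsz", "dejlnopsvw" };
        Console.WriteLine(new string(IntersectViaSet(inputs).ToArray())); // prints "djs"
    }
}
```

This builds only one set, but it reads the inputs eagerly and ignores their ordering, so it trades the laziness asked for in the question for simplicity.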