Question

I think this takes O(A x B) time to execute.

(where A is the size of collectionA and B is the size of collectionB)

Am I wrong?

IEnumerable<A> GetMatches(IEnumerable<A> collectionA, IEnumerable<B> collectionB)
{
    foreach (A a in collectionA)
        foreach (B b in collectionB)
            if (a.Value == b.Value)
                yield return a;
}

Is there a faster way to perform this query? (Maybe using LINQ?)

Answer 1

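Unfortunately, Enumerable.Intersect will not work here, since you are comparing two different types (A and B).

Getting a working Intersect call would require some separate handling.

You can do this in stages: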
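To illustrate the kind of separate handling involved: Intersect needs both sequences to have the same element type, so one workaround is to project each collection to its Value first. The sketch below assumes hypothetical classes A and B with an int Value property (the question does not show their definitions). Note that the result is the set of shared values, not the matching A objects.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element types; the question does not show the real ones.
class A { public int Value { get; set; } }
class B { public int Value { get; set; } }

static class IntersectSketch
{
    // Projects both collections to their Value and intersects the projections.
    // Returns the distinct shared values, not the A instances themselves.
    public static IEnumerable<int> CommonValues(IEnumerable<A> collectionA, IEnumerable<B> collectionB)
    {
        return collectionA.Select(a => a.Value)
                          .Intersect(collectionB.Select(b => b.Value));
    }

    static void Main()
    {
        var listA = new[] { new A { Value = 1 }, new A { Value = 2 }, new A { Value = 3 } };
        var listB = new[] { new B { Value = 2 }, new B { Value = 3 }, new B { Value = 4 } };

        // Prints the values present in both collections.
        Console.WriteLine(string.Join(", ", CommonValues(listA, listB)));
    }
}
```

Because only values come back, recovering the A objects still takes a second pass over collectionA.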

IEnumerable<A> GetMatches<A, B>(IEnumerable<A> collectionA, IEnumerable<B> collectionB)
     where A : ISomeConstraintWithValueProperty
     where B : ISomeOtherConstraintWithSameValueProperty
{
    // Collect the distinct values in B; HashSet lookups are O(1) on average
    var values = new HashSet<TypeOfValue>(collectionB.Select(b => b.Value));

    // Keep every a whose Value appears in B
    return collectionA.Where(a => values.Contains(a.Value));
}

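Note that this returns duplicates if collectionA contains duplicates, but not if collectionB does (the nested loop yields an element once for every matching b), so its results differ slightly from your looping code.

If you want unique matches (each returned only once), you can change the last line to: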

return collectionA.Where(a => values.Contains(a.Value)).Distinct();
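To see how the two versions differ when duplicates are present, the sketch below runs the question's nested loop and the HashSet version on the same input. The classes A and B with an int Value are assumptions made for the example. Keep in mind that Distinct() on a class that does not override Equals compares references, so it collapses repeats of the same object, not different objects that happen to share a Value.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element types; the question does not show the real ones.
class A { public int Value { get; set; } }
class B { public int Value { get; set; } }

static class DuplicateSketch
{
    // Nested loop from the question: yields a once per matching b.
    public static IEnumerable<A> LoopMatches(IEnumerable<A> collectionA, IEnumerable<B> collectionB)
    {
        foreach (A a in collectionA)
            foreach (B b in collectionB)
                if (a.Value == b.Value)
                    yield return a;
    }

    // HashSet version: yields a at most once per occurrence in collectionA.
    public static IEnumerable<A> SetMatches(IEnumerable<A> collectionA, IEnumerable<B> collectionB)
    {
        var values = new HashSet<int>(collectionB.Select(b => b.Value));
        return collectionA.Where(a => values.Contains(a.Value));
    }

    static void Main()
    {
        var listA = new[] { new A { Value = 1 }, new A { Value = 2 } };
        // collectionB contains the value 2 twice.
        var listB = new[] { new B { Value = 2 }, new B { Value = 2 }, new B { Value = 3 } };

        Console.WriteLine(LoopMatches(listA, listB).Count()); // 2: one hit per matching b
        Console.WriteLine(SetMatches(listA, listB).Count());  // 1: one hit per a
    }
}
```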

Answer 2

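If your data is already sorted, you can try the following intersection algorithm, which runs in O(m + n); if it is not, the sort makes it O(n log n). It does not consume extra memory: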

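    private static IEnumerable<A> Intersect(A[] alist, B[] blist)
    {
        // Sort both arrays in place by Value (assumes Value implements IComparable, e.g. an int).
        // Skip these two calls if the data is already sorted by Value.
        Array.Sort(alist, (x, y) => x.Value.CompareTo(y.Value));
        Array.Sort(blist, (x, y) => x.Value.CompareTo(y.Value));

        // Walk both arrays in step, advancing whichever side has the smaller Value.
        for (int i = 0, j = 0; i < alist.Length && j < blist.Length;)
        {
            if (alist[i].Value == blist[j].Value)
            {
                yield return alist[i];
                i++;
                j++;
            }
            else if (alist[i].Value < blist[j].Value)
            {
                i++;
            }
            else
            {
                j++;
            }
        }
    }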
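For the already-sorted case, the walk can be used on its own without the sort step. The following is a minimal sketch under that assumption, again with hypothetical classes A and B that expose an int Value; the name IntersectSorted is made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element types; the question does not show the real ones.
class A { public int Value { get; set; } }
class B { public int Value { get; set; } }

static class MergeSketch
{
    // Assumes both lists are already sorted ascending by Value.
    // Each step advances at least one index, so the walk is O(m + n).
    public static IEnumerable<A> IntersectSorted(IReadOnlyList<A> alist, IReadOnlyList<B> blist)
    {
        int i = 0, j = 0;
        while (i < alist.Count && j < blist.Count)
        {
            int cmp = alist[i].Value.CompareTo(blist[j].Value);
            if (cmp == 0)
            {
                yield return alist[i];
                i++;
                j++;
            }
            else if (cmp < 0)
            {
                i++; // a is smaller, so it cannot match any later b
            }
            else
            {
                j++; // b is smaller, so it cannot match any later a
            }
        }
    }

    static void Main()
    {
        var alist = new[] { 1, 3, 5, 7 }.Select(v => new A { Value = v }).ToArray();
        var blist = new[] { 3, 4, 7, 9 }.Select(v => new B { Value = v }).ToArray();

        // Prints "3, 7"
        Console.WriteLine(string.Join(", ", IntersectSorted(alist, blist).Select(a => a.Value)));
    }
}
```

Since both indexes advance on a match, repeated values are paired off one to one, which differs from the nested loop when either side contains duplicates.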

How do I match between collections of different types?
