LINQ statement with a where clause makes execution slow

Time: 2013-05-08 10:23:10

Tags: c# linq

I have a question about LINQ and the where statement. I have the following code sample (a simplified version of the code I use in my application):

// Get the images from a datasource.
var images = GetImages(); // returns IEnumerable<Image>

// I continue processing images until everything has been processed.
while (images.Any())
{
    // I'm going to determine what kind of image it is and do some actions with it.
    var image = images.First();

    // At some point in my process I add a where statement to my images collection to fetch all images that match the specified criteria.
    // As long as my images collection is not empty, the same where statement may be executed against the images collection again.
    // This is where the problem is: when I don't add the ToList() extension method, my LINQ statement becomes slow, really slow.
    // When I add the ToList() extension method, why does my LINQ statement run fast again?
    var saveImages = images.Where(<criteria>); //.ToList() is needed here to make my LINQ query performant again.

    // I do something with these saved images and then remove them from the current images collection, because I no longer need to process them, using the following statement.
    images = images.Except(saveImages);
}

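As the comments in the code sample explain: why does my LINQ statement become fast again when I add the ToList() extension method? Why can't I just use the Where statement on its own, given that it returns an IEnumerable collection?

I'm really confused and I hope someone can explain this to me :).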

2 Answers:

Answer 0 (score: 5)

As you go through the loop, images first becomes this

images.Except(firstSetOfExclusions)

then this

images.Except(firstSetOfExclusions).Except(secondSetOfExclusions)

then this

images.Except(firstSetOfExclusions).Except(secondSetOfExclusions).Except(thirdSetOfExclusions)

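and so on. The slowness comes from deferred execution: unless you call ToList, each set of exclusions is itself a query that has to be executed again every time the chain is enumerated. With each iteration of the loop this gets slower and slower, because the same queries are executed over and over again. ToList fixes this by "materializing" the query results in memory.

Note that another solution to this problem is to "materialize" the new subset of images, like this: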

images = images.Except(saveImages).ToList();

This avoids chaining up the Except calls, so you no longer need to call ToList on saveImages.
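To make the effect measurable, here is a minimal sketch that counts how often the underlying source gets enumerated with and without materializing. The names GetItems, sourceEnumerations and CountSourceEnumerations are hypothetical, and the integer source and the `i == current` predicate are only stand-ins for GetImages() and the question's criteria.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Hypothetical counter: how many times the underlying source is enumerated.
    private static int sourceEnumerations;

    // Hypothetical stand-in for GetImages(). It is an iterator block, so the
    // counter is bumped each time someone starts enumerating the sequence.
    private static IEnumerable<int> GetItems()
    {
        sourceEnumerations++;
        for (int i = 1; i <= 5; i++) yield return i;
    }

    // Runs the loop from the question; 'materialize' toggles the ToList() call.
    public static int CountSourceEnumerations(bool materialize)
    {
        sourceEnumerations = 0;
        IEnumerable<int> items = GetItems();

        while (items.Any())
        {
            int current = items.First();
            // Stand-in for the question's criteria (an assumption).
            IEnumerable<int> batch = items.Where(i => i == current);
            IEnumerable<int> rest = items.Except(batch);
            // Either keep the lazy chain or snapshot it into a list.
            items = materialize ? rest.ToList() : rest;
        }
        return sourceEnumerations;
    }

    public static void Main()
    {
        Console.WriteLine("lazy chain:  {0}", CountSourceEnumerations(false));
        Console.WriteLine("with ToList: {0}", CountSourceEnumerations(true));
    }
}
```

The first number should come out much larger than the second, because every Except left in the lazy chain re-runs everything beneath it each time the chain is enumerated.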

Answer 1 (score: 3)

Perhaps it will make more sense if we re-implement LINQ-to-Objects so that each method logs when it runs; here is our Main:

static void Main()
{
    Log();
    IEnumerable<int> data = GetData();

    while (data.Any())
    {
        var value = data.First();
        Console.WriteLine("\t\tFound:{0}", value);
        var found = data.Where(i => i == value);
        data = data.Except(found);
    }
}
static IEnumerable<int> GetData()
{
    Log();
    return new[] { 1, 2, 3, 4, 5 };
}

Looks innocent, doesn't it? Now run it and look at the logged output (the LINQ methods are shown at the bottom) - we get:

Main
GetData
Any
First
                Found:1
Any
Except
Where
First
Except
Where
                Found:2
Any
Except
Where
Except
Where
Except
Where
First
Except
Where
Except
Where
Except
Where
                Found:3
Any
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
First
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
                Found:4
Any
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
First
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
                Found:5
Any
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where

Notice how the amount of work grows between each item?

For bonus points, make GetData an iterator block - see how many times GetData gets executed?

static IEnumerable<int> GetData()
{
    Log();
    yield return 1;
    yield return 2;
    yield return 3;
    yield return 4;
    yield return 5;
}

94 times (instead of once in the original version). Fun, huh?
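A rough accounting of that figure, assuming (as in the re-implementation shown in this answer) that Except enumerates each of its two operands once and that Any and First each start one enumeration. The symbol $E_k$ is introduced here only for illustration: it stands for the number of times GetData runs when the sequence held in data after $k$ loop iterations is enumerated once.

```latex
\begin{align*}
E_0 &= 1, \\
E_{k+1} &= \underbrace{E_k}_{\text{build the exclusion set}}
         + \underbrace{E_k}_{\text{iterate the source}} = 2E_k
         \quad\Longrightarrow\quad E_k = 2^k, \\
\text{total} &= \sum_{k=0}^{4} \underbrace{2E_k}_{\text{Any + First}}
              + \underbrace{E_5}_{\text{final Any}}
              = 2\,(1 + 2 + 4 + 8 + 16) + 32 = 94 .
\end{align*}
```

The same doubling is visible in the log: the Any calls following Found:1 through Found:5 are followed by 1, 3, 7, 15 and 31 Except/Where pairs respectively.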

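This isn't LINQ's fault - it's because you're using LINQ in a very strange way. For what you're doing, you'd be better off working with a flat collection (List<T>), adding and removing items as needed.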
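A minimal sketch of what working with a flat List<T> could look like. The class and method names (FlatListDemo, ProcessAll) are hypothetical, and grouping equal integers is only a stand-in for the question's image criteria.

```csharp
using System;
using System.Collections.Generic;

public static class FlatListDemo
{
    // Splits the source into batches of matching items, removing each batch
    // from the working list as it goes. Hypothetical helper for illustration.
    public static List<List<int>> ProcessAll(IEnumerable<int> source)
    {
        // Copy once into a list so the source is enumerated exactly one time.
        var remaining = new List<int>(source);
        var batches = new List<List<int>>();

        while (remaining.Count > 0)
        {
            int current = remaining[0];

            // FindAll returns a new list right away; nothing is deferred.
            List<int> batch = remaining.FindAll(i => i == current);
            batches.Add(batch);

            // Remove the processed items in place.
            remaining.RemoveAll(i => i == current);
        }
        return batches;
    }

    public static void Main()
    {
        foreach (var batch in ProcessAll(new[] { 1, 2, 1, 3, 2 }))
        {
            Console.WriteLine(string.Join(", ", batch));
        }
    }
}
```

Because the list is mutated directly, each loop iteration only touches the items that are still left, rather than replaying a growing chain of queries.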

Here is the LINQ re-implementation:

public static bool Any<T>(this IEnumerable<T> data)
{
    Log();
    using (var iter = data.GetEnumerator())
    {
        return iter.MoveNext();
    }
}
static void Log([CallerMemberName] string name = null)
{
    Console.WriteLine(name);
}
public static T First<T>(this IEnumerable<T> data)
{
    Log();
    using (var iter = data.GetEnumerator())
    {
        if (iter.MoveNext()) return iter.Current;
        throw new InvalidOperationException();
    }
}
public static IEnumerable<T> Where<T>(this IEnumerable<T> data, Func<T,bool> predicate)
{
    Log();
    foreach (var item in data) if (predicate(item)) yield return item;
}
public static IEnumerable<T> Except<T>(this IEnumerable<T> data, IEnumerable<T> except)
{
    Log();
    var exclude = new HashSet<T>(except);
    foreach (var item in data)
    {
        if (!exclude.Contains(item)) yield return item;
    }
}