Restrict the enumerations of a LINQ query to one only

Time: 2019-04-08 03:14:42

Tags: c# linq task-parallel-library

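I have a LINQ query that should not be enumerated more than once, and I want to avoid enumerating it twice by mistake. Is there an extension method I can use to protect myself from this kind of mistake? I am thinking of something like this: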

var numbers = Enumerable.Range(1, 10).OnlyOnce();
Console.WriteLine(numbers.Count()); // shows 10
Console.WriteLine(numbers.Count()); // throws InvalidOperationException: The query cannot be enumerated more than once.

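I need this because I have an enumerable of tasks that is meant to instantiate and run the tasks progressively, as it is enumerated slowly under control. I have already made the mistake of running the tasks twice, because I forgot that it is a deferred enumerable and not an array.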

var tasks = Enumerable.Range(1, 10).Select(n => Task.Run(() => Console.WriteLine(n)));
Task.WaitAll(tasks.ToArray()); // Let's wait for the tasks to finish...
Console.WriteLine(String.Join(", ", tasks.Select(t => t.Id))); // Let's see the completed task IDs...
// Oops! A new set of tasks started running!

3 Answers:

Answer 0: (score: 2)

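An enumerable enumerates, end of story. You just need to call ToList or ToArray, so that the query runs once and its results are stored: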

// this will enumerate and start the tasks
var tasks = Enumerable.Range(1, 10)
                      .Select(n => Task.Run(() => Console.WriteLine(n)))
                      .ToList();

// wait for them all to finish
Task.WaitAll(tasks.ToArray());
Console.WriteLine(String.Join(", ", tasks.Select(t => t.Id)));
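
To see why materializing helps, the following minimal sketch counts how often the Select projection runs. The names DeferredDemo and CountProjections are hypothetical and exist only for this illustration; a counter stands in for Task.Run.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Counts how many times the Select projection runs over the given number of passes.
    public static int CountProjections(bool materialize, int passes)
    {
        int calls = 0;
        IEnumerable<int> query = Enumerable.Range(1, 10)
                                           .Select(n => { calls++; return n; });

        if (materialize)
            query = query.ToList(); // the projection runs here, exactly once per element

        for (int i = 0; i < passes; i++)
            foreach (var item in query) { } // a deferred query re-runs on every pass

        return calls;
    }

    public static void Main()
    {
        Console.WriteLine(CountProjections(materialize: false, passes: 2)); // 20
        Console.WriteLine(CountProjections(materialize: true, passes: 2));  // 10
    }
}
```

With the deferred query, every pass runs the projection again, which is exactly how a second set of tasks got started in the question.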

Hmm, if you want parallelism:

Parallel.For(0, 100, index => Console.WriteLine(index) );

Or, if you are using the async/await pattern:

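// ActionBlock and ExecutionDataflowBlockOptions live in System.Threading.Tasks.Dataflow (TPL Dataflow).
public static async Task DoWorkLoads(IEnumerable<Something> results)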
{
   var options = new ExecutionDataflowBlockOptions
                     {
                        MaxDegreeOfParallelism = 50
                     };

   var block = new ActionBlock<Something>(MyMethodAsync, options);

   foreach (var result in results)
      block.Post(result);

   block.Complete();
   await block.Completion;

}

...

// Must be static to be referenced from the static DoWorkLoads above.
public static async Task MyMethodAsync(Something result)
{
   await SomethingAsync(result);
}

Update: since you are looking for a way to control the maximum degree of concurrency, you can use this:

public static async Task<IEnumerable<Task>> ExecuteInParallel<T>(this IEnumerable<T> collection,Func<T, Task> callback,int degreeOfParallelism)
{
   var queue = new ConcurrentQueue<T>(collection);

   var tasks = Enumerable.Range(0, degreeOfParallelism)
                         .Select(async _ =>
                          {
                             while (queue.TryDequeue(out var item))
                                await callback(item);
                          })
                         .ToArray();

   await Task.WhenAll(tasks);

   return tasks;
}
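
As a rough check that the method above really caps concurrency, here is a sketch that records the highest number of callbacks seen running at once. ConcurrencyDemo, MeasurePeakAsync and the Task.Delay workload are hypothetical names and stand-ins, not part of the original answer. ExecuteInParallel is repeated only so that the sketch compiles on its own.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelExtensions
{
    // Copied from the answer above so that this sketch is self-contained.
    public static async Task<IEnumerable<Task>> ExecuteInParallel<T>(
        this IEnumerable<T> collection, Func<T, Task> callback, int degreeOfParallelism)
    {
        var queue = new ConcurrentQueue<T>(collection);
        var tasks = Enumerable.Range(0, degreeOfParallelism)
                              .Select(async _ =>
                              {
                                  while (queue.TryDequeue(out var item))
                                      await callback(item);
                              })
                              .ToArray();
        await Task.WhenAll(tasks);
        return tasks;
    }
}

public static class ConcurrencyDemo
{
    // Returns the highest number of callbacks observed running at the same time.
    public static async Task<int> MeasurePeakAsync(int items, int degree)
    {
        int running = 0, peak = 0;
        await Enumerable.Range(1, items).ExecuteInParallel(async n =>
        {
            int now = Interlocked.Increment(ref running);
            // Raise the recorded peak if this is the most seen so far.
            int seen;
            while (now > (seen = Volatile.Read(ref peak)) &&
                   Interlocked.CompareExchange(ref peak, now, seen) != seen) { }
            await Task.Delay(20); // hypothetical stand-in for real async work
            Interlocked.Decrement(ref running);
        }, degree);
        return peak;
    }

    public static async Task Main()
    {
        int peak = await MeasurePeakAsync(items: 20, degree: 3);
        Console.WriteLine(peak <= 3);
    }
}
```

Note that the ConcurrentQueue constructor enumerates the whole source up front, so if the source is a deferred query that itself calls Task.Run, all tasks would start immediately. Passing plain items plus a callback, as here, avoids that.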

Answer 1: (score: 2)

  

"I want to avoid enumerating it twice by mistake."

You can wrap the collection in one that throws if it is enumerated twice.

For example:

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace ConsoleApp8
{
    public static class EnumExtension
    {
        class OnceEnumerable<T> : IEnumerable<T>
        {
            IEnumerable<T> col;
            bool hasBeenEnumerated = false;
            public OnceEnumerable(IEnumerable<T> col)
            {
                this.col = col;
            }

            public IEnumerator<T> GetEnumerator()
            {
                if (hasBeenEnumerated)
                {
                    throw new InvalidOperationException("This collection has already been enumerated.");
                }
                this.hasBeenEnumerated = true;
                return col.GetEnumerator();
            }

            IEnumerator IEnumerable.GetEnumerator()
            {
                return GetEnumerator();
            }
        }

        public static IEnumerable<T> OnlyOnce<T>(this IEnumerable<T> col)
        {
            return new OnceEnumerable<T>(col);
        }
    }
    class Program
    {
        static void Main(string[] args)
        {
             var col = Enumerable.Range(1, 10).OnlyOnce();

             var colCount = col.Count(); // first enumeration
             foreach (var c in col) // second enumeration: throws InvalidOperationException
             {
                 Console.WriteLine(c);
             }
        }
    }
}
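
The wrapper above uses a plain bool, so two threads calling GetEnumerator at the same moment could both get through. If that matters, one possible variation flips the flag atomically with Interlocked.Exchange. This is only a sketch: OnceExtensions and OnlyOnceThreadSafe are hypothetical names, and the exception message is borrowed from the question.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public static class OnceExtensions
{
    private sealed class OnceEnumerable<T> : IEnumerable<T>
    {
        private readonly IEnumerable<T> _source;
        private int _enumerated; // 0 = not yet enumerated, 1 = enumerator already handed out

        public OnceEnumerable(IEnumerable<T> source) =>
            _source = source ?? throw new ArgumentNullException(nameof(source));

        public IEnumerator<T> GetEnumerator()
        {
            // Atomically set the flag; only the very first caller sees the old value 0.
            if (Interlocked.Exchange(ref _enumerated, 1) != 0)
                throw new InvalidOperationException(
                    "The query cannot be enumerated more than once.");
            return _source.GetEnumerator();
        }

        IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
    }

    // Hypothetical name, chosen to avoid clashing with OnlyOnce above.
    public static IEnumerable<T> OnlyOnceThreadSafe<T>(this IEnumerable<T> source) =>
        new OnceEnumerable<T>(source);
}

public static class Program
{
    public static void Main()
    {
        var numbers = Enumerable.Range(1, 10).OnlyOnceThreadSafe();
        Console.WriteLine(numbers.Count()); // first enumeration succeeds
        try
        {
            Console.WriteLine(numbers.Count()); // second enumeration throws
        }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}
```

Either version only guards calls to GetEnumerator. An enumerator that was obtained but abandoned halfway still counts as the one allowed enumeration.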

Answer 2: (score: 1)

Rx is certainly an option for controlling parallelism.

var query =
    Observable
        .Range(1, 10)
        .Select(n => Observable.FromAsync(() => Task.Run(() => new { Id = n })));

var tasks = query.Merge(maxConcurrent: 3).ToArray().Wait();

Console.WriteLine(String.Join(", ", tasks.Select(t => t.Id)));