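Limiting the number of parallel threads in C#

Date: 2012-01-13 16:32:56

Tags: c# c#-4.0 parallel-processing

I am writing a C# program that generates 500,000 files and uploads them via FTP. I want to process 4 files in parallel, since the machine has 4 cores and generating the files takes much longer than uploading them. Is it possible to convert the following PowerShell example to C#? Or is there a better framework for this, such as an actor framework for C# (like F#'s MailboxProcessor)?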

Powershell example

$maxConcurrentJobs = 3;

# Read the input and queue it up
$jobInput = get-content .\input.txt
$queue = [System.Collections.Queue]::Synchronized( (New-Object System.Collections.Queue) )
foreach($item in $jobInput)
{
    $queue.Enqueue($item)
}

# Function that pops input off the queue and starts a job with it
function RunJobFromQueue
{
    if( $queue.Count -gt 0)
    {
        $j = Start-Job -ScriptBlock {param($x); Get-WinEvent -LogName $x} -ArgumentList $queue.Dequeue()
        Register-ObjectEvent -InputObject $j -EventName StateChanged -Action { RunJobFromQueue; Unregister-Event $eventsubscriber.SourceIdentifier; Remove-Job $eventsubscriber.SourceIdentifier } | Out-Null
    }
}

# Start up to the max number of concurrent jobs
# Each job will take care of running the rest
for( $i = 0; $i -lt $maxConcurrentJobs; $i++ )
{
    RunJobFromQueue
}

Update
The connection to the remote FTP server may be slow, so I want to throttle the FTP upload processing.

5 answers:

Answer 0 (score: 31)

Assuming you are building this with the TPL, you can set ParallelOptions.MaxDegreeOfParallelism to whatever value you want.

See Parallel.For for a code example.
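Since the linked example is not reproduced here, the following is a minimal sketch of capping Parallel.For with ParallelOptions.MaxDegreeOfParallelism. The class name ParallelForSketch, the Run method, and the peak-concurrency bookkeeping are hypothetical and exist only to make the cap observable; the work delegate stands in for the file generation and upload step.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of the original answer.
public static class ParallelForSketch
{
    // Runs work(i) for i in [0, count) with at most maxParallel iterations in flight.
    // Returns the highest concurrency that was observed.
    public static int Run(int count, int maxParallel, Action<int> work)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };
        int running = 0, peak = 0;

        Parallel.For(0, count, options, i =>
        {
            int now = Interlocked.Increment(ref running);

            // Raise the recorded peak with a compare-and-swap loop.
            int seen;
            while (now > (seen = Volatile.Read(ref peak)) &&
                   Interlocked.CompareExchange(ref peak, now, seen) != seen) { }

            try { work(i); }
            finally { Interlocked.Decrement(ref running); }
        });

        return peak;
    }
}
```

Calling ParallelForSketch.Run(500000, 4, i => GenerateAndUpload(i)), where GenerateAndUpload is an assumed user-supplied method, would keep at most four iterations running at once. Parallel.For blocks the calling thread until every iteration has finished.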

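Answer 1 (score: 17)

The Task Parallel Library is your friend here. See this link, which describes what is available to you. It ships with .NET Framework 4 and schedules work on pooled background threads, scaling the number of threads to the number of processors on the machine it runs on.

Perhaps something along these lines: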

ParallelOptions options = new ParallelOptions();

options.MaxDegreeOfParallelism = 4;

Then in your loop, something like:

// WebClient has no Upload method; UploadFile(address, fileName) sends a local file.
Parallel.Invoke(options,
 () => new WebClient().UploadFile("http://www.linqpad.net", "lp.html"),
 () => new WebClient().UploadFile("http://www.jaoo.dk", "jaoo.html"));

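Answer 2 (score: 5)

If you are using .NET 4.0, you can use the Parallel library.

Assuming you are iterating over the 500,000 files, you can parallelize the iteration with Parallel.ForEach, for instance, or you can have a look at PLINQ. Here is a comparison between the two.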
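To make the two options concrete, here is a small sketch that runs the same per-file step once with Parallel.ForEach and once with PLINQ, each capped at 4 concurrent operations. The class and method names are hypothetical, and the process delegate is a placeholder for whatever generates or uploads a file; results are sorted only so the two variants can be compared.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical helper, not part of the original answer.
public static class ForEachVsPlinqSketch
{
    // Parallel.ForEach: an imperative loop; the cap is set through ParallelOptions.
    public static List<string> WithParallelForEach(IEnumerable<string> files, Func<string, string> process)
    {
        var results = new ConcurrentBag<string>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = 4 };
        Parallel.ForEach(files, options, file => results.Add(process(file)));
        return results.OrderBy(r => r, StringComparer.Ordinal).ToList();
    }

    // PLINQ: a declarative query; the cap is set through WithDegreeOfParallelism.
    public static List<string> WithPlinq(IEnumerable<string> files, Func<string, string> process)
    {
        return files.AsParallel()
                    .WithDegreeOfParallelism(4)
                    .Select(process)
                    .OrderBy(r => r, StringComparer.Ordinal)
                    .ToList();
    }
}
```

As a rough guide, Parallel.ForEach tends to suit side-effecting work such as uploads, while PLINQ reads more naturally when the goal is to transform a sequence into a result.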

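Answer 3 (score: 2)

Essentially, you want to create an Action or Task for each file to upload, put them in a List, and then process that list while limiting how many can run in parallel.

My blog post shows how to do this with both Tasks and Actions, and provides a sample project you can download and run to see both in action.

With Actions

If you use Actions, you can use the built-in .NET Parallel.Invoke function. Here we limit it to running at most 4 threads in parallel.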

var listOfActions = new List<Action>();
foreach (var file in files)
{
    var localFile = file;
    // Note that we only create the Action here; it runs later, inside Parallel.Invoke.
    listOfActions.Add(() => UploadFile(localFile));
}

var options = new ParallelOptions {MaxDegreeOfParallelism = 4};
Parallel.Invoke(options, listOfActions.ToArray());

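This option does not support async, though, and I am assuming your FileUpload function will be async, so you may want to use the Task example below.

With Tasks

With Tasks there is no built-in function. However, you can use the one I provide on my blog.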

    /// <summary>
    /// Starts the given tasks and waits for them to complete. This will run, at most, the specified number of tasks in parallel.
    /// <para>NOTE: If one of the given tasks has already been started, an exception will be thrown.</para>
    /// </summary>
    /// <param name="tasksToRun">The tasks to run.</param>
    /// <param name="maxTasksToRunInParallel">The maximum number of tasks to run in parallel.</param>
    /// <param name="cancellationToken">The cancellation token.</param>
    public static async Task StartAndWaitAllThrottledAsync(IEnumerable<Task> tasksToRun, int maxTasksToRunInParallel, CancellationToken cancellationToken = new CancellationToken())
    {
        await StartAndWaitAllThrottledAsync(tasksToRun, maxTasksToRunInParallel, -1, cancellationToken);
    }

    /// <summary>
    /// Starts the given tasks and waits for them to complete. This will run the specified number of tasks in parallel.
    /// <para>NOTE: If a timeout is reached before the Task completes, another Task may be started, potentially running more than the specified maximum allowed.</para>
    /// <para>NOTE: If one of the given tasks has already been started, an exception will be thrown.</para>
    /// </summary>
    /// <param name="tasksToRun">The tasks to run.</param>
    /// <param name="maxTasksToRunInParallel">The maximum number of tasks to run in parallel.</param>
    /// <param name="timeoutInMilliseconds">The maximum milliseconds we should allow the max tasks to run in parallel before allowing another task to start. Specify -1 to wait indefinitely.</param>
    /// <param name="cancellationToken">The cancellation token.</param>
    public static async Task StartAndWaitAllThrottledAsync(IEnumerable<Task> tasksToRun, int maxTasksToRunInParallel, int timeoutInMilliseconds, CancellationToken cancellationToken = new CancellationToken())
    {
        // Convert to a list of tasks so that we don't enumerate over it multiple times needlessly.
        var tasks = tasksToRun.ToList();

        using (var throttler = new SemaphoreSlim(maxTasksToRunInParallel))
        {
            var postTaskTasks = new List<Task>();

            // Have each task notify the throttler when it completes so that it decrements the number of tasks currently running.
            tasks.ForEach(t => postTaskTasks.Add(t.ContinueWith(tsk => throttler.Release())));

            // Start running each task.
            foreach (var task in tasks)
            {
                // Increment the number of tasks currently running and wait if too many are running.
                await throttler.WaitAsync(timeoutInMilliseconds, cancellationToken);

                cancellationToken.ThrowIfCancellationRequested();
                task.Start();
            }

            // Wait for all of the provided tasks to complete.
            // We wait on the list of "post" tasks instead of the original tasks, otherwise there is a potential race condition where the throttler's using block is exited before some Tasks have had their "post" action completed, which references the throttler, resulting in an exception due to accessing a disposed object.
            await Task.WhenAll(postTaskTasks.ToArray());
        }
    }

Then, to create your list of Tasks and call the function to run them, with at most 4 executing at a time, you could do this:

var listOfTasks = new List<Task>();
foreach (var file in files)
{
    var localFile = file;
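    // Note that we create the Task here, but do not start it.
    // The task blocks until the upload finishes (assuming UploadFile returns a Task).
    // An async lambda here would bind to Task(Action), become async void,
    // and let the task complete at its first await, releasing the throttle too early.
    listOfTasks.Add(new Task(() => UploadFile(localFile).Wait()));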
}
await Tasks.StartAndWaitAllThrottledAsync(listOfTasks, 4);

Also, because this method supports async, it will not block the UI thread the way Parallel.Invoke or Parallel.ForEach would.
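A Task built with the Task(Action) constructor cannot observe the completion of an async lambda, so a common alternative for genuinely asynchronous uploads is to pass Func<Task> delegates and throttle them directly with SemaphoreSlim. The following is a minimal sketch of that idea, not the code from the blog post; ThrottleSketch, ForEachThrottledAsync, and RunOneAsync are hypothetical names, and the operation delegate stands in for an async upload method.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not the blog's StartAndWaitAllThrottledAsync.
public static class ThrottleSketch
{
    // Runs one async operation per item, with at most maxParallel in flight.
    public static async Task ForEachThrottledAsync<T>(
        IEnumerable<T> items, int maxParallel, Func<T, Task> operation)
    {
        using (var throttler = new SemaphoreSlim(maxParallel))
        {
            var running = new List<Task>();
            foreach (var item in items)
            {
                // Wait asynchronously for a free slot before starting the next operation.
                await throttler.WaitAsync();
                running.Add(RunOneAsync(item, operation, throttler));
            }

            // Wait for every operation before the semaphore is disposed.
            await Task.WhenAll(running);
        }
    }

    private static async Task RunOneAsync<T>(T item, Func<T, Task> operation, SemaphoreSlim throttler)
    {
        try { await operation(item); }
        finally { throttler.Release(); } // Free the slot even if the operation fails.
    }
}
```

With an assumed async method UploadFileAsync, the call would look like await ThrottleSketch.ForEachThrottledAsync(files, 4, f => UploadFileAsync(f)). Because the slot is released only after the awaited operation completes, the limit applies to the whole upload rather than to the portion before its first await.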

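Answer 4 (score: 0)

I wrote the technique below, in which I use a BlockingCollection as a thread count manager. It is simple to implement and gets the job done. It accepts Task objects and adds an integer value to the blocking collection, which increases the running thread count by 1. When a thread finishes, it takes an item back out of the collection, which unblocks the pending Add call of the next waiting task.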

        public class BlockingTaskQueue
        {
            private BlockingCollection<int> threadManager { get; set; } = null;
            public bool IsWorking
            {
                get
                {
                    return threadManager.Count > 0 ? true : false;
                }
            }

            public BlockingTaskQueue(int maxThread)
            {
                threadManager = new BlockingCollection<int>(maxThread);
            }

            // Queue the work on the thread pool; the returned task completes once `task` has finished.
            public Task AddTask(Task task)
            {
                return Task.Run(() => Run(task));
            }

            private bool Run(Task task)
            {
                try
                {
                    threadManager.Add(1);
                    task.Start();
                    task.Wait();
                    return true;

                }
                catch (Exception ex)
                {
                    return false;
                }
                finally
                {
                    threadManager.Take();
                }

            }

        }