Parallel processing of a List

Date: 2014-10-08 13:00:31

Tags: c# parallel-processing .net-3.5

My scenario: I need to process a list of items. Processing each item is very time-consuming (1-10 seconds). Instead of

var retval = new List<TResult>();   // TResult: placeholder for ProcessItem's return type
foreach (var item in myList)
    retval.Add(ProcessItem(item));
return retval;

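I would like to process the items in parallel.

I know .NET offers many approaches to parallel processing: which one is best? (Note: I am stuck on framework version 3.5 and cannot use Task, async, or any of the fancy features that come with .NET 4...)

Here is my attempt using delegates: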

private void DoTest(int processingTaskDuration)
{
    List<int> itemsToProcess = new List<int>();
    for (int i = 1; i <= 20; i++)
        itemsToProcess.Add(i);

    TestClass tc = new TestClass(processingTaskDuration);

    DateTime start = DateTime.Now;
    List<int> result = tc.ProcessList(itemsToProcess);
    TimeSpan elapsed = DateTime.Now - start;
    System.Diagnostics.Debug.WriteLine(string.Format("elapsed (msec)= {0}", (int)elapsed.TotalMilliseconds));
}

public class TestClass
{
    static int s_Counter = 0;
    static object s_lockObject = new Object();

    int m_TaskMsecDuration = 0;
    public TestClass() :
        this(5000)
    {
    }

    public TestClass(int taskMsecDuration)
    {
        m_TaskMsecDuration = taskMsecDuration;
    }


    public int LongOperation(int itemToProcess)
    {
        int currentCounter = 0;
        lock (s_lockObject)
        {
            s_Counter++;
            currentCounter = s_Counter;
        }

        System.Diagnostics.Debug.WriteLine(string.Format("LongOperation\tStart\t{0}\t{1}\t{2}", currentCounter, System.Threading.Thread.CurrentThread.ManagedThreadId, DateTime.Now.ToString("HH:mm:ss.ffffff")));

        // time consuming task, e.g 5 seconds
        Thread.Sleep(m_TaskMsecDuration);
        int retval = itemToProcess * 2;

        System.Diagnostics.Debug.WriteLine(string.Format("LongOperation\tEnd  \t{0}\t{1}\t{2}", currentCounter, System.Threading.Thread.CurrentThread.ManagedThreadId, DateTime.Now.ToString("HH:mm:ss.ffffff")));
        return retval;
    }

    delegate int LongOperationDelegate(int itemToProcess);
    public List<int> ProcessList(List<int> itemsToProcess)
    {
        List<IAsyncResult> asyncResults = new List<IAsyncResult>();
        LongOperationDelegate del = LongOperation;

        foreach (int item in itemsToProcess)
        {
            IAsyncResult res = del.BeginInvoke(item, null, null);
            asyncResults.Add(res);
        }

        // list of waitHandles to wait for
        List<WaitHandle> waitHandles = new List<WaitHandle>();
        asyncResults.ForEach(el => waitHandles.Add(el.AsyncWaitHandle));


        // wait for processing every item
        WaitHandle.WaitAll(waitHandles.ToArray());


        // retrieve result of processing
        List<int> retval = new List<int>();
        asyncResults.ForEach(res =>
        {
            int singleProcessingResult = del.EndInvoke(res);
            retval.Add(singleProcessingResult);
        }
        );
        return retval;
    }
}

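Here is some output (column 3 is a progressive counter, used to match the start and end of each call; column 4 is the thread ID; the last column is the timestamp):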

LongOperation   Start   1   6   15:11:18.331619
LongOperation   Start   2   12  15:11:18.331619
LongOperation   Start   3   13  15:11:19.363722
LongOperation   Start   4   14  15:11:19.895775
LongOperation   Start   5   15  15:11:20.406826
LongOperation   Start   6   16  15:11:21.407926
LongOperation   Start   7   17  15:11:22.410026
LongOperation   End     1   6   15:11:23.360121
LongOperation   End     2   12  15:11:23.361122
LongOperation   Start   8   12  15:11:23.363122
LongOperation   Start   9   6   15:11:23.365122
LongOperation   Start   10  18  15:11:23.907176
LongOperation   End     3   13  15:11:24.365222
LongOperation   Start   11  13  15:11:24.366222
LongOperation   End     4   14  15:11:24.897275
LongOperation   Start   12  14  15:11:24.898275
LongOperation   Start   13  19  15:11:25.407326
LongOperation   End     5   15  15:11:25.408326
LongOperation   Start   14  15  15:11:25.412327
LongOperation   Start   15  20  15:11:26.407426
LongOperation   End     6   16  15:11:26.410426
LongOperation   Start   16  16  15:11:26.410426
LongOperation   Start   17  21  15:11:27.408526
LongOperation   End     7   17  15:11:27.411527
LongOperation   Start   18  17  15:11:27.413527
LongOperation   End     8   12  15:11:28.365622
LongOperation   Start   19  12  15:11:28.366622
LongOperation   End     9   6   15:11:28.366622
LongOperation   Start   20  6   15:11:28.389624
LongOperation   End     10  18  15:11:28.908676
LongOperation   End     11  13  15:11:29.367722
LongOperation   End     12  14  15:11:29.899775
LongOperation   End     13  19  15:11:30.411827
LongOperation   End     14  15  15:11:30.413827
LongOperation   End     15  20  15:11:31.407926
LongOperation   End     16  16  15:11:31.411927
LongOperation   End     17  21  15:11:32.413027
LongOperation   End     18  17  15:11:32.416027
LongOperation   End     19  12  15:11:33.389124
LongOperation   End     20  6   15:11:33.391124
elapsed (msec)= 15075

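So:

Are delegates the right approach?

Did I implement it correctly?

If so, why does the third operation start one second after the first two (and so on)?

I mean, I would expect the whole run to take more or less the time of a single item, but the system seems to use the thread pool in a strange way. After all, I ask for 20 threads, and it waits before spawning the third one after the first two calls.

3 Answers:

Answer 0: (score: 2)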

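I think the 3.5 backport of Reactive Extensions ships with an implementation of Parallel.ForEach(), which you should be able to use. The port should contain only what was needed to get Rx working on 3.5, but that should be enough.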

Others have tried implementing it as well, basically just by queuing work items on the ThreadPool.
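To make the "queue work items on the ThreadPool" idea concrete, here is a minimal sketch that sticks to APIs available in .NET 3.5. `ParallelHelper` and `Map` are hypothetical names invented for this illustration; they are not part of the framework or of the Rx backport. The sketch keeps results in input order and rethrows the first failure on the calling thread.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical helper for illustration only; not a framework or Rx type.
public static class ParallelHelper
{
    public static List<TResult> Map<T, TResult>(IList<T> items, Func<T, TResult> selector)
    {
        TResult[] results = new TResult[items.Count];
        if (items.Count == 0)
            return new List<TResult>(results);

        int pending = items.Count;      // work items still running
        Exception firstError = null;    // first failure seen, if any

        using (ManualResetEvent allDone = new ManualResetEvent(false))
        {
            for (int i = 0; i < items.Count; i++)
            {
                int index = i; // copy so each callback sees its own index
                ThreadPool.QueueUserWorkItem(delegate
                {
                    try
                    {
                        results[index] = selector(items[index]);
                    }
                    catch (Exception ex)
                    {
                        // Remember only the first failure.
                        Interlocked.CompareExchange(ref firstError, ex, null);
                    }
                    finally
                    {
                        // The last callback to finish releases the caller.
                        if (Interlocked.Decrement(ref pending) == 0)
                            allDone.Set();
                    }
                });
            }
            allDone.WaitOne();
        }

        if (firstError != null)
            throw new InvalidOperationException("At least one item failed.", firstError);
        return new List<TResult>(results);
    }
}
```

With the question's class, a call might look like `List<int> result = ParallelHelper.Map<int, int>(itemsToProcess, tc.LongOperation);`. A single counter plus one event avoids creating a wait handle per item, which matters once the list grows beyond a few dozen entries.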

Answer 1: (score: 0)

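void Main()
{
    var list = new List<int> { 1, 2, 3 };
    int processes = list.Count;
    foreach (var item in list)
    {
        // Copy the loop variable: before C# 5 all closures share a single 'item'.
        int current = item;
        ThreadPool.QueueUserWorkItem(s =>
        {
            ProcessItem(current);
            // Decrement atomically; a plain 'processes--' can lose updates across threads.
            Interlocked.Decrement(ref processes);
        });
    }
    // Poll until every queued work item has finished.
    while (Thread.VolatileRead(ref processes) > 0) { Thread.Sleep(10); }
}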

static void ProcessItem(int item)
{
    Thread.Sleep(100); // do work
}

Answer 2: (score: 0)

I got to the bottom of the third question:

  

If so, why does the third operation start one second after the first two (and so on)?

The problem seems to lie in the default way the ThreadPool manages thread creation: see http://msdn.microsoft.com/en-us/library/0ka9477y%28v=VS.90%29.aspx. Quote:

  

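The thread pool has a built-in delay (half a second in the .NET Framework version 2.0) before starting new idle threads. If your application periodically starts many tasks in a short time, a small increase in the number of idle threads can produce a significant increase in throughput. Setting the number of idle threads too high consumes system resources needlessly.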

Calling ThreadPool.SetMinThreads with an appropriate value seems to help a lot. At the start of my ProcessList, I inserted a call to this method:

private void SetUpThreadPool(int numThreadDesired)
{
    int currentWorkerThreads;
    int currentCompletionPortThreads;
    ThreadPool.GetMinThreads(out currentWorkerThreads, out currentCompletionPortThreads);
    //System.Diagnostics.Debug.WriteLine(string.Format("ThreadPool.GetMinThreads: workerThreads = {0}, completionPortThreads = {1}", currentWorkerThreads, currentCompletionPortThreads));

    const int MAXIMUM_VALUE_FOR_SET_MIN_THREAD_PARAM = 20;
    int numMinThreadToSet = Math.Min(numThreadDesired, MAXIMUM_VALUE_FOR_SET_MIN_THREAD_PARAM);
    if (currentWorkerThreads < numMinThreadToSet)
        ThreadPool.SetMinThreads(numMinThreadToSet, currentCompletionPortThreads); // apply the capped value, not the raw request
}

public List<int> ProcessList(List<int> itemsToProcess)
{
    SetUpThreadPool(itemsToProcess.Count);
    ...
}

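Now all threads (up to 20) start at the same moment, with no delay. I think 20 is a good compromise for MAXIMUM_VALUE_FOR_SET_MIN_THREAD_PARAM: not too high, and it fits my particular requirements.

I am still wondering about the main questions: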
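Since the minimum is a process-wide setting, one possible refinement is to raise it only while the batch runs and put the old value back afterwards. The sketch below is an assumption about how that could be packaged: `MinThreadsScope` is a hypothetical class name, and only `ThreadPool.GetMinThreads` and `ThreadPool.SetMinThreads` are real framework calls.

```csharp
using System;
using System.Threading;

// Hypothetical helper: raises the worker-thread minimum for the lifetime of the scope.
public sealed class MinThreadsScope : IDisposable
{
    private readonly int m_OriginalWorker;
    private readonly int m_OriginalIo;
    private readonly bool m_Changed;

    public MinThreadsScope(int desiredWorkerThreads, int cap)
    {
        ThreadPool.GetMinThreads(out m_OriginalWorker, out m_OriginalIo);

        // Never ask for more than the cap, and never lower an existing minimum.
        int target = Math.Min(desiredWorkerThreads, cap);
        if (target > m_OriginalWorker)
            m_Changed = ThreadPool.SetMinThreads(target, m_OriginalIo); // false if the pool rejects the value
    }

    public void Dispose()
    {
        // Restore the previous minimum only if this scope changed it.
        if (m_Changed)
            ThreadPool.SetMinThreads(m_OriginalWorker, m_OriginalIo);
    }
}
```

Inside ProcessList this would wrap the existing body, for example `using (new MinThreadsScope(itemsToProcess.Count, 20)) { ... }`, so the rest of the application gets the default thread-injection behaviour back once the list has been processed.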

  
    
        
  1. Are delegates the right approach?
  2. Did I implement it correctly?
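On the second question, one detail may be worth checking: WaitHandle.WaitAll accepts at most 64 handles and is not supported for multiple handles on an STA thread, so the WaitAll step could fail for longer lists or when called from a UI thread. Because EndInvoke already blocks until its call has completed, the wait can be dropped. The following is only a sketch of that variation, targeting the .NET Framework; `DelegateBatch` is a hypothetical class name and its `LongOperation` is a shortened stand-in for the one in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical class for illustration; mirrors the question's ProcessList.
public class DelegateBatch
{
    private delegate int LongOperationDelegate(int itemToProcess);

    // Stand-in for the question's LongOperation (assumption: 100 ms of work).
    private static int LongOperation(int itemToProcess)
    {
        Thread.Sleep(100);
        return itemToProcess * 2;
    }

    public static List<int> ProcessList(List<int> itemsToProcess)
    {
        LongOperationDelegate del = LongOperation;
        List<IAsyncResult> asyncResults = new List<IAsyncResult>(itemsToProcess.Count);

        // Start every call on the thread pool.
        foreach (int item in itemsToProcess)
            asyncResults.Add(del.BeginInvoke(item, null, null));

        // EndInvoke blocks until the matching call is done, so no WaitAll is
        // needed, and results come back in the same order as the input.
        List<int> retval = new List<int>(itemsToProcess.Count);
        foreach (IAsyncResult res in asyncResults)
            retval.Add(del.EndInvoke(res));
        return retval;
    }
}
```

Collecting results through EndInvoke also surfaces any exception thrown by an individual call at the point where its result is read.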

Thanks, everyone, for the help.