Parallel.ForEach loop never finishes

Time: 2014-07-09 15:40:41

Tags: c# .net parallel-processing parallel.foreach

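My code works as I expect on smaller loops, but when I iterate over a larger IP range the process seems to stall and never completes.

The program keeps running and no exception is thrown.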

public void Discover()
{
    Int32 MaxThreadCount     = 20, 
        DevicesProcessed     = 0;
    List<String> IPAddresses = GetIPAddresses(IPRangesToCheck).ToList();

    using (cts = new CancellationTokenSource())
    {

        ParallelOptions po = new ParallelOptions();
        po.MaxDegreeOfParallelism = MaxThreadCount;
        po.CancellationToken = cts.Token;

        try
        {
            var deviceInfo = new DeviceInfo();

            Parallel.ForEach(IPAddresses, po, (item, loopState) =>
            {
                try
                {
                    Console.WriteLine("DevicesProcessed: {0}, IPAddresses.Count: {1}", DevicesProcessed, IPAddresses.Count.ToString());
                    if (DevicesProcessed >= IPAddresses.Count)
                    {
                        Console.WriteLine("!!!THRESHOLD REACHED!!!");
                        cts.Cancel();
                    }

                    if (loopState.ShouldExitCurrentIteration || loopState.IsExceptional)
                    {
                        loopState.Stop();
                        DiscoverStatus = Devices.DiscoverState.Stopped;
                    }

                    var response = CheckIPAndReturnInfo(item);

                    if (loopState.ShouldExitCurrentIteration || loopState.IsExceptional)
                    {
                        loopState.Stop();
                        DiscoverStatus = Devices.DiscoverState.Stopped;
                    }

                    if (response != null)
                    {
                        FoundDevices.Add(response);
                        Console.WriteLine(
                            "Device Count {0}, IP Address {1}",
                            DevicesProcessed,
                            item);
                    }

                    Interlocked.Increment(ref DevicesProcessed);
                }
                catch (Exception ex)
                {
                    Console.Write("Error! = " + ex.Message);

                    if (deviceInfo.State == CommunicationState.Faulted)
                    {
                        loopState.Stop();
                    }
                }
            });

            RaiseAllItemsCompleteEvent();

        }
        catch (AggregateException aggEx)
        {
            Console.WriteLine("Parallel Exception!: " + aggEx);
        }
    }
}

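When I use the range 172.16.0.0 to 172.16.255.255, there are 65536 IP addresses to check. The program reaches 65519 and then stops processing any more addresses, without advancing any further in the code. At that point, no breakpoint anywhere in the code is hit.

The if (DevicesProcessed >= IPAddresses.Count) part is an attempt to cancel the loop manually once it has finished, but unfortunately execution never gets that far. Cancelling before that point works fine.

I have not been able to find anyone else who has run into this kind of problem with parallel tasks, and I'm stumped. Any suggestions on how to debug this further would be greatly appreciated.

I added a stopwatch to CheckIPAndReturnInfo to see how long it takes. That is where the Start: and End: lines in the output are generated. Here is a sample of the program's output: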

DevicesProcessed: 65515, IPAddresses.Count: 65536
Start : 172.16.114.94
End : 172.16.114.94 Elaspsed: 00:00:01.4247110
A first chance exception of type 'System.Net.Sockets.SocketException' occurred in System.dll
A first chance exception of type 'Lextm.SharpSnmpLib.Messaging.TimeoutException' occurred in SharpSnmpLib.dll
End : 172.16.114.60 Elaspsed: 00:00:07.0803977
DevicesProcessed: 65517, IPAddresses.Count: 65536
Start : 172.16.114.61
A first chance exception of type 'Lextm.SharpSnmpLib.Messaging.TimeoutException' occurred in SharpSnmpLib.dll
End : 172.16.114.76 Elaspsed: 00:00:07.0807436
DevicesProcessed: 65518, IPAddresses.Count: 65536
Start : 172.16.114.77
A first chance exception of type 'Lextm.SharpSnmpLib.Messaging.TimeoutException' occurred in SharpSnmpLib.dll
The thread '<No Name>' (0x1664) has exited with code 0 (0x0).
A first chance exception of type 'Lextm.SharpSnmpLib.Messaging.TimeoutException' occurred in SharpSnmpLib.dll
A first chance exception of type 'Lextm.SharpSnmpLib.Messaging.TimeoutException' occurred in SharpSnmpLib.dll
End : 172.16.114.61 Elaspsed: 00:00:07.0806001
DevicesProcessed: 65519, IPAddresses.Count: 65536
Start : 172.16.114.62
A first chance exception of type 'Lextm.SharpSnmpLib.Messaging.TimeoutException' occurred in SharpSnmpLib.dll
End : 172.16.114.77 Elaspsed: 00:00:07.0807928
DevicesProcessed: 65520, IPAddresses.Count: 65536
Start : 172.16.114.78
A first chance exception of type 'Lextm.SharpSnmpLib.Messaging.TimeoutException' occurred in SharpSnmpLib.dll
The thread '<No Name>' (0xf90) has exited with code 0 (0x0).
The thread '<No Name>' (0x1b24) has exited with code 0 (0x0).
A first chance exception of type 'System.Net.Sockets.SocketException' occurred in System.dll
A first chance exception of type 'Lextm.SharpSnmpLib.Messaging.TimeoutException' occurred in SharpSnmpLib.dll
A first chance exception of type 'Lextm.SharpSnmpLib.Messaging.TimeoutException' occurred in SharpSnmpLib.dll
End : 172.16.114.62 Elaspsed: 00:00:07.0806679
A first chance exception of type 'Lextm.SharpSnmpLib.Messaging.TimeoutException' occurred in SharpSnmpLib.dll
End : 172.16.114.78 Elaspsed: 00:00:07.0804395
The thread '<No Name>' (0x1518) has exited with code 0 (0x0).
The thread '<No Name>' (0x1e1c) has exited with code 0 (0x0).
The thread '<No Name>' (0x1190) has exited with code 0 (0x0).
The thread '<No Name>' (0x1218) has exited with code 0 (0x0).
The thread '<No Name>' (0x1c68) has exited with code 0 (0x0).
The thread '<No Name>' (0x1b20) has exited with code 0 (0x0).
The thread '<No Name>' (0xa14) has exited with code 0 (0x0).
The thread '<No Name>' (0x185c) has exited with code 0 (0x0).
The thread '<No Name>' (0x430) has exited with code 0 (0x0).

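2 answers:

Answer 0: (score: 2)

  1. Don't use IsCompleted in an if statement. Simply put RaiseAllItemsCompleteEvent after Parallel.ForEach.

  2. In your exception handler, don't rethrow the exception, because that causes Parallel to fault. Just use loopState.Stop.

  3. A silly example: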

    Parallel.ForEach(Enumerable.Range(0, 100),
        (i, state) =>
        {
            Console.WriteLine(i);
            if(i==11) state.Stop();
        });
    RaiseAllItemsCompleteEvent();
    

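    In your condition checks, simply use if(loopState.IsStopped). Given your try/catch block, you are unlikely ever to hit IsExceptional.

    Finally, because you are running in parallel, you must use Interlocked.Increment(ref DevicesProcessed). Otherwise the count will be inaccurate.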
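
The points above can be put together in one small sketch. It is only an illustration under assumptions: StopDemo, Run, failAt and the simulated failure are hypothetical names that do not come from the question, and the range 0..999 stands in for the list of IP addresses.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class StopDemo
{
    // Hypothetical helper, not from the question: processes 0..999 in parallel
    // and returns how many items finished without error.
    public static int Run(int failAt)
    {
        int processed = 0;

        ParallelLoopResult result = Parallel.ForEach(Enumerable.Range(0, 1000), (i, state) =>
        {
            // Skip the work once any iteration has called Stop().
            if (state.IsStopped) return;

            try
            {
                if (i == failAt)
                    throw new InvalidOperationException("simulated failure");

                // Atomic increment; a plain processed++ can lose updates.
                Interlocked.Increment(ref processed);
            }
            catch (InvalidOperationException)
            {
                // Ask the loop to stop instead of rethrowing.
                state.Stop();
            }
        });

        // This line runs after the loop, whether it completed or was stopped.
        Console.WriteLine("IsCompleted: {0}, processed: {1}", result.IsCompleted, processed);
        return processed;
    }
}
```

Calling StopDemo.Run(11) stops early, so the returned count is below 1000 and IsCompleted is false; the exact count varies from run to run because iterations already in flight are allowed to finish. Calling StopDemo.Run(-1) never fails and processes all 1000 items.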

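Answer 1: (score: 0)

cts.Cancel() can throw an AggregateException.

Catch it in the inner try and see whether it is hit there. If it is, your current code is missing an inner catch for `AggregateException`, and it will loop forever.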
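
A minimal sketch of the behaviour this answer refers to: CancellationTokenSource.Cancel() runs the callbacks registered on the token, and if any of them throws, the exceptions are collected into an AggregateException thrown from Cancel() itself. CancelDemo and the failing callback are hypothetical and only there to trigger that path; the post does not establish whether the question's code registers any such callback.

```csharp
using System;
using System.Threading;

public static class CancelDemo
{
    // Returns true if Cancel() surfaced an AggregateException.
    public static bool CancelThrowsAggregate()
    {
        using (var cts = new CancellationTokenSource())
        {
            // Hypothetical callback that fails when cancellation is requested.
            cts.Token.Register(() => { throw new InvalidOperationException("callback failed"); });

            try
            {
                cts.Cancel();
                return false;
            }
            catch (AggregateException ex)
            {
                // The callback's exception is wrapped inside the aggregate.
                Console.WriteLine("Cancel() threw: " + ex.InnerExceptions[0].Message);
                return true;
            }
        }
    }
}
```

Wrapping the cts.Cancel() call in its own try/catch like this is one way to check whether the exception is raised at that point.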