Parallel.ForEach hangs after 3 yield returns?

Time: 2018-12-17 20:41:13

Tags: c# parallel.foreach

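I'm running into a problem where the action passed to Parallel.ForEach is sometimes never called. I wrote a simple toy program that shows the problem in the single-threaded case: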

class Program
{
    static void Main(string[] args) 
    {
        // Any value > 3 here causes Parallel.ForEach to hang on the yield return
        int workCount = 4;
        bool inProcess = false;

        System.Collections.Generic.IEnumerable<int> getWorkItems()
        {
            while (workCount > 0)
            {
                if (!inProcess)
                {
                    inProcess = true;
                    System.Console.WriteLine($"    Returning work: {workCount}");
                    yield return workCount;
                }
            }
        }

        System.Threading.Tasks.Parallel.ForEach(getWorkItems(),
            new System.Threading.Tasks.ParallelOptions { MaxDegreeOfParallelism = 1 },
            (workItem) =>
            {
                System.Console.WriteLine($"      Parallel start:  {workItem}");
                workCount--;
                System.Console.WriteLine($"      Parallel finish: {workItem}");
                inProcess = false;
            });

        System.Console.WriteLine($"=================== Finished ===================\r\n");
    }
}

The program's output is:

Returning work: 4
  Parallel start:  4
  Parallel finish: 4
Returning work: 3
  Parallel start:  3
  Parallel finish: 3
Returning work: 2
  Parallel start:  2
  Parallel finish: 2
Returning work: 1

...and then it just hangs there. The action is never called for item 1. What is going on here?

--------------------------- EDIT: more detailed example -----------------------

Here is the same program with more detailed output and some locks protecting the shared values:

// Requires: using System.Threading; (for Thread and Interlocked)
static object lockOnMe = new object();
static void Run()
{
    System.Console.WriteLine($"Starting ThreadId: {Thread.CurrentThread.ManagedThreadId}");

    // Any value > 3 here causes Parallel.ForEach to hang on the yield return
    int workCount = 40;
    bool inProcess = false;

    System.Collections.Generic.IEnumerable<int> getWorkItems()
    {
        while (workCount > 0)
        {
            lock(lockOnMe)
            {
                if (!inProcess)
                {
                    inProcess = true;
                    System.Console.WriteLine($"    Returning work: {workCount} ThreadId: {Thread.CurrentThread.ManagedThreadId}");
                    yield return workCount;
                }
            }

            Thread.Sleep(100);
            System.Console.Write($".");
        }
    }

    System.Threading.Tasks.Parallel.ForEach(getWorkItems(),
    new System.Threading.Tasks.ParallelOptions { MaxDegreeOfParallelism = 1 },
    (workItem) =>
    {
        lock(lockOnMe)
        {
            System.Console.WriteLine($"      Parallel start:  {workItem}  ThreadId: {Thread.CurrentThread.ManagedThreadId}");
            Interlocked.Decrement(ref workCount);
            System.Console.WriteLine($"      Parallel finish: {workItem}");
            inProcess = false;

        }
    });

    System.Console.WriteLine($"=================== Finished ===================\r\n");
}

Output:

Starting ThreadId: 1
Returning work: 40 ThreadId: 1
  Parallel start:  40  ThreadId: 1
  Parallel finish: 40
Returning work: 39 ThreadId: 1
  Parallel start:  39  ThreadId: 1
  Parallel finish: 39
Returning work: 38 ThreadId: 1
  Parallel start:  38  ThreadId: 1
  Parallel finish: 38
Returning work: 37 ThreadId: 1
......................

2 answers:

Answer 0 (score: 2)

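You are getting into a state where the work count is returned on one thread after inProcess has been set to false, while on another thread the enumerator ends up in an infinite loop. The parallelism safeguards apply to the items returned from the enumeration; they do not isolate the producer from the consumer. If you want this to work, you need a lock everywhere inProcess or workCount is read or written.

If you raise the degree of parallelism, the work count may also come out wrong.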

Edit

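The reason it does not work is that the default options Parallel.ForEach uses to create its Partitioner allow buffering. If you create the partitioner yourself and disable buffering, it works as expected. Basically, the partitioner has a heuristic that reads ahead and caches values returned by the IEnumerable, and that breaks the logic here: the enumerator is asked for the next item before the action has run for the previous one, so it spins forever waiting for inProcess to become false.

If you want it to work as expected, do the following.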

    private static void Main(string[] args)
    {
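        // Requires: using System.Collections.Concurrent; (for Partitioner)
        // Same shared state and producer as in the question.
        int workCount = 4;
        bool inProcess = false;

        System.Collections.Generic.IEnumerable<int> getWorkItems()
        {
            while (workCount > 0)
            {
                if (!inProcess)
                {
                    inProcess = true;
                    System.Console.WriteLine($"    Returning work: {workCount}");
                    yield return workCount;
                }
            }
        }

        // NoBuffering makes the partitioner take one item at a time instead of reading ahead.
        var partitioner = Partitioner.Create(getWorkItems(), EnumerablePartitionerOptions.NoBuffering);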
        System.Threading.Tasks.Parallel.ForEach(partitioner,
            new System.Threading.Tasks.ParallelOptions { MaxDegreeOfParallelism = 1  },
            (workItem) =>
            {
                System.Console.WriteLine($"      Parallel start:  {workItem}");
                workCount--;
                System.Console.WriteLine($"      Parallel finish: {workItem}");
                inProcess = false;
            });

        System.Console.WriteLine($"=================== Finished ===================\r\n");
        var s = System.Console.ReadLine();
    }
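To see the read-ahead directly, here is a minimal sketch that measures how far the producer runs ahead of the item currently being processed, once with the default partitioner options and once with NoBuffering. The names ReadAheadDemo, Produce and MaxReadAhead are made up for this illustration, and the exact chunk sizes are an implementation detail of the runtime, so treat the numbers as indicative only.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

class ReadAheadDemo
{
    // Number of items the producer has yielded so far.
    static int produced;

    // Hypothetical producer: yields 1..count and counts every item it hands out.
    static IEnumerable<int> Produce(int count)
    {
        for (int i = 1; i <= count; i++)
        {
            Interlocked.Increment(ref produced);
            yield return i;
        }
    }

    // Runs Parallel.ForEach with a single worker and returns the largest gap
    // observed between "items produced" and "item currently being processed".
    static int MaxReadAhead(Func<IEnumerable<int>, Partitioner<int>> makePartitioner)
    {
        produced = 0;
        int maxLead = 0;
        var options = new ParallelOptions { MaxDegreeOfParallelism = 1 };

        Parallel.ForEach(makePartitioner(Produce(64)), options, item =>
        {
            int lead = Volatile.Read(ref produced) - item;
            if (lead > maxLead) maxLead = lead;
        });

        return maxLead;
    }

    static void Main()
    {
        int buffered = MaxReadAhead(src => Partitioner.Create(src));
        int unbuffered = MaxReadAhead(
            src => Partitioner.Create(src, EnumerablePartitionerOptions.NoBuffering));

        Console.WriteLine($"default options: max read-ahead = {buffered}");
        Console.WriteLine($"NoBuffering:     max read-ahead = {unbuffered}");
    }
}
```

With NoBuffering the read-ahead should stay at 0, while the default options should report a positive value. Any positive read-ahead is enough to deadlock the question's producer, because it refuses to yield a second item until the action has run for the first.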

Answer 1 (score: 1)

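This is not a complete answer, but here is what I have learned so far:

Parallel.ForEach does not behave like traditional multi-threading approaches, which would have no trouble with shared code like the sample. Even when the shared data is protected against concurrent access, there appear to be optimizations in the ForEach logic that do not play well when the parallel threads need to interlock (for example, to process objects in sequence).

To get a feel for the odd behavior of Parallel.ForEach, try running the following snippet: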

// Requires: using System.Threading; (for Thread)
static void Run()
{
    System.Collections.Generic.IEnumerable<int> getWorkItems()
    {
        int workCount = 9999;
        while (workCount > 0)
        {
            System.Console.Write($"R");
            yield return workCount--;
            Thread.Sleep(10);
        }
    }

    System.Threading.Tasks.Parallel.ForEach(
        getWorkItems(),
        (workItem) => System.Console.Write($"."));
}

Output:

R.R.R.R.R.R.R.R.R.R.R.R.R.R.R.R.R.R.R.R.R.R.R.R.R.R.R.R.R.R.R.R.R.R.R.R.R.R.R.
RR..RR..RR..RR..RR..RR..RR..RR..RR..RR..RR..RR..RR..RR..RR..RR..RR..RR..RR..RR
..RR..RR..RR..RR..RR..RR..RR..RR..RR..RR..RR..RR..RR..RR..RR..RR..RR..RR..RR..
RRRR....RRRR....RRRR....RRRR....RRRR....RRRR....RRRR....RRRR....RRRR....RRRR..
..RRRR....RRRR....RRRR....RRRR....RRRR....RRRR....RRRR....RRRR....RRRR....RRRR
....RRRR....RRRR....RRRR....RRRR....RRRR....RRRR....RRRR....RRRR....RRRR....RR
RR....RRRR....RRRR....RRRR....RRRR....RRRR....RRRR....RRRR....RRRR....RRRR....
RRRRRRRR........RRRRRRRR........RRRRRRRR........RRRRRRRR........RRRRRRRR......
..RRRRRRRR........RRRRRRRR........RRRRRRRR........RRRRRRRR........RRRRRRRR....
....RRRRRRRR........RRRRRRRR........R.RRRRRRRR........RRRRRRRR........RRRRRRRR
........RRRRRRRR........RRRRRRRR........RRRRRRRR........RRRRRRRR........RRRRRR
RR........RRRRRRRR........RRRRRRRR........RRRRRRRR........RRRRRRRR........RRRR
RRRR........R.RRRRRRRR........RRRRRRRR........RRRRRRRR........RRRRRRRR........
RRRRRRRR........RRRRRRRR........RRRRRRRR........RRRRRRRR........RRRRRRRR......
..RRRRRRRR........RRRRRRRR........RRRRRRRR........RRRRRRRR........R.RRRRRRRR..
......RRRRRRRRRRRR...

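...and so on. I have no explanation for this behavior, but my guess is that whatever handles the iterator is trying to optimize how much input it buffers. Either way, this behavior wreaks havoc on code that tries to synchronize on a shared object.

Moral of the story: if your processing is fully parallel and needs no synchronization, use Parallel.ForEach. If you need synchronization, try something else, such as collecting the data into an array up front or writing your own multi-threaded handler.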
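As a rough sketch of the "collect the data into an array first" suggestion: once the work items are materialized, the producer no longer runs while the loop body executes, so nothing can deadlock between them. The class name PrecollectDemo and the work items (the integers 1 to 40) are placeholders chosen for this illustration.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

class PrecollectDemo
{
    static void Main()
    {
        // Materialize all work items before starting the parallel loop.
        int[] workItems = Enumerable.Range(1, 40).ToArray();

        long sum = 0;
        Parallel.ForEach(workItems, item =>
        {
            // Interlocked keeps the shared total correct at any degree of parallelism.
            Interlocked.Add(ref sum, item);
        });

        Console.WriteLine(sum); // prints 820 (1 + 2 + ... + 40)
    }
}
```

This only fits when the full set of work is known before processing starts. If items must be produced in response to earlier results, as in the question, a dedicated producer/consumer structure is probably a better match than Parallel.ForEach.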