Why doesn't WithMergeOptions(ParallelMergeOptions.NotBuffered) provide results immediately?

Asked: 2013-01-29 03:07:04

Tags: c# .net task-parallel-library plinq

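(I'm currently limited to .NET 4.0.)

My situation is that I want to process items in parallel as much as possible, order must be maintained, and items can be added at any time until "stop" is pressed.

Items may arrive in "bursts", so the queue may drain completely, there will be a pause, and then a large number of items will come in again.

I want each result to be available as soon as it is finished.

Here is a simplified example: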

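using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

class Program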
{
    static void Main(string[] args)
    {
        BlockingCollection<int> itemsQueue = new BlockingCollection<int>();

        Random random = new Random();

        var results = itemsQueue
        .GetConsumingEnumerable()
        .AsParallel()
        .AsOrdered()
        .WithMergeOptions(ParallelMergeOptions.NotBuffered)
        .Select(i =>
        {
            int work = 0;

            Console.WriteLine("Working on " + i);

            //simulate work
            for (int busy = 0; busy <= 90000000; ++busy) { ++work; };

            Console.WriteLine("Finished " + i);


            return i;
        });

        TaskCompletionSource<bool> completion = new TaskCompletionSource<bool>();

        Task.Factory.StartNew(() =>
        {
            foreach (int i in results)
            {
                Console.WriteLine("Result Available: " + i);
            }
            completion.SetResult(true);
        });

        int iterations;
        iterations = random.Next(5, 50);
        Console.WriteLine("------- iterations: " + iterations + "-------");

        for (int i = 1; i <= iterations; ++i)
        {
            itemsQueue.Add(i);
        }

        while (true)
        {
            char c = Console.ReadKey().KeyChar;

            if (c == 's')
            {
                break;
            }
            else
            {
                ++iterations;

                Console.WriteLine("adding: " + iterations);
                itemsQueue.Add(iterations);
            }
        }


        itemsQueue.CompleteAdding();

        completion.Task.Wait();

        Console.WriteLine("Done!");
        Console.ReadKey();
        itemsQueue.Dispose();
    }
}

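What generally happens, as the example above shows, is that results become available up until the last few (I'm not 100% sure of this, but the number of results it stops short by may be roughly correlated with the number of cores on the box). Nothing more appears until itemsQueue.CompleteAdding() is called (in the example, by pressing the 's' key), at which point the remaining results finally become available.

Why are the results not available immediately even though I specify .WithMergeOptions(ParallelMergeOptions.NotBuffered), and how can I make them available immediately?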

1 answer:

Answer 0 (score: 2)

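Note that the problem does not arise if you are able to call the BlockingCollection<T>.CompleteAdding() instance method: that causes all remaining results to complete.

Short answer

If, on the other hand, you need to maintain order, need results available as soon as possible, and have no opportunity to call BlockingCollection<T>.CompleteAdding(), then, if at all possible, you are much better off consuming the items in the queue non-parallel and instead parallelizing the processing of each individual task.

E.g.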

class Program
{
    // Not parallel, but suitable for monitoring the queue;
    // you can then focus on parallelizing each individual task.
    static void Main(string[] args)
    {
        BlockingCollection<int> itemsQueue = new BlockingCollection<int>();


        Random random = new Random();

        var results = itemsQueue.GetConsumingEnumerable()
        .Select(i =>
        {
            Console.WriteLine("Working on " + i);

            //Focus your parallelization efforts on the work of 
            //the individual task
            //E.g, simulated:
            double work = Enumerable.Range(0, 90000000 - (10 * (i % 3)))
            .AsParallel()
            .Select(w => w + 1)
            .Average();

            Console.WriteLine("Finished " + i);


            return i;
        });

        TaskCompletionSource<bool> completion = new TaskCompletionSource<bool>();

        Task.Factory.StartNew(() =>
        {
            foreach (int i in results)
            {
                Console.WriteLine("Result Available: " + i);
            }
            completion.SetResult(true);
        });

        int iterations;
        iterations = random.Next(5, 50);
        Console.WriteLine("------- iterations: " + iterations + "-------");

        for (int i = 1; i <= iterations; ++i)
        {
            itemsQueue.Add(i);
        }

        while (true)
        {
            char c = Console.ReadKey().KeyChar;

            if (c == 's')
            {
                break;
            }
            else
            {
                ++iterations;

                Console.WriteLine("adding: " + iterations);
                itemsQueue.Add(iterations);
            }
        }


        itemsQueue.CompleteAdding();

        completion.Task.Wait();

        Console.WriteLine("Done!");
        Console.ReadKey();
        itemsQueue.Dispose();
    }
}

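Longer answer

There appears to be an interaction between BlockingCollection<T> in particular and AsOrdered().

It seems that AsOrdered() stops processing tasks whenever one of the partition enumerators blocks.

The default partitioner hands out chunks that are usually larger than 1, and the blocking queue will block until a chunk can be filled (or until CompleteAdding is called).

However, even with a chunk size of 1, the problem does not go away completely.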
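For readers who are not restricted to .NET 4.0: from .NET 4.5 onward, chunking can be switched off without writing a partitioner, by passing EnumerablePartitionerOptions.NoBuffering to Partitioner.Create. The following is only a minimal sketch under that assumption; the NoBufferingSketch class and its Build method are hypothetical names used for illustration. As noted above, a chunk size of 1 does not necessarily remove the stall with AsOrdered(), so treat this as a way to rule chunking out, not as a guaranteed fix.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

static class NoBufferingSketch
{
    // Hypothetical helper: ordered PLINQ query over a consuming enumerable
    // with the partitioner's chunking disabled (requires .NET 4.5 or later).
    public static ParallelQuery<int> Build(BlockingCollection<int> queue)
    {
        // NoBuffering makes the partitioner hand out one item at a time
        // instead of waiting to fill a chunk.
        OrderablePartitioner<int> partitioner = Partitioner.Create(
            queue.GetConsumingEnumerable(),
            EnumerablePartitionerOptions.NoBuffering);

        return partitioner
            .AsParallel()
            .AsOrdered()
            .WithMergeOptions(ParallelMergeOptions.NotBuffered)
            .Select(i => i * 2); // stand-in for the real work
    }

    static void Main()
    {
        using (var queue = new BlockingCollection<int>())
        {
            for (int i = 1; i <= 5; ++i)
            {
                queue.Add(i);
            }

            // Completed up front here so the demo terminates on its own.
            queue.CompleteAdding();

            foreach (int result in Build(queue))
            {
                Console.WriteLine("Result Available: " + result);
            }
        }
    }
}
```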

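To take the default chunking out of the picture, you can implement your own partitioner that hands out one item at a time; even then, the behaviour can still sometimes be observed. (Note that if you specify .WithDegreeOfParallelism(1), the problem of waiting for results goes away, but of course a degree of parallelism of 1 rather defeats the purpose!)

E.g.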

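using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public class ImmediateOrderedPartitioner<T> : OrderablePartitioner<T>
{
    private readonly IEnumerable<T> _consumingEnumerable;
    private readonly Ordering _ordering = new Ordering();

    // keysOrderedAcrossPartitions is false: items are handed out on demand,
    // so a later partition can receive a smaller key than an earlier one.
    public ImmediateOrderedPartitioner(BlockingCollection<T> collection) : base(true, false, true)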
    {
        _consumingEnumerable = collection.GetConsumingEnumerable();
    }

    private class Ordering
    {
        public int Order = -1;
    }

    // One enumerator per partition. All of them pull from the same consuming
    // enumerable and synchronize on the shared Ordering instance, so taking an
    // item and assigning its order key happen as one atomic step.
    private class MyEnumerator<S> : IEnumerator<KeyValuePair<long, S>>
    {
        private readonly IEnumerable<S> _enumerable;

        private readonly Ordering _ordering;

        private KeyValuePair<long, S> _current;

        private bool _hasItem;

        public MyEnumerator(IEnumerable<S> consumingEnumerable, Ordering ordering)
        {
            _enumerable = consumingEnumerable;
            _ordering = ordering;
        }

        public KeyValuePair<long, S> Current
        {
            get
            {
                if (_hasItem)
                {
                    return _current;
                }
                else
                    throw new InvalidOperationException();
            }
        }

        public void Dispose()
        {

        }

        object System.Collections.IEnumerator.Current
        {
            get
            {
                return Current;
            }
        }

        public bool MoveNext()
        {
            // Lock on the shared Ordering object, not on a per-enumerator lock
            lock (_ordering)
            {
                bool canMoveNext = false;

                // Blocks until one item arrives or CompleteAdding() is called
                var next = _enumerable.Take(1).FirstOrDefault(s => { canMoveNext = true; return true; });

                if (canMoveNext)
                {
                    // Increment exactly once so the keys are 0, 1, 2, ... with no gaps
                    _current = new KeyValuePair<long, S>(++_ordering.Order, next);
                    _hasItem = true;
                }
                else
                {
                    _hasItem = false;
                }

                return canMoveNext;
            }
        }

        public void Reset()
        {
            throw new NotSupportedException();
        }
    }

    public override IList<IEnumerator<KeyValuePair<long, T>>> GetOrderablePartitions(int partitionCount)
    {
        var result = new List<IEnumerator<KeyValuePair<long, T>>>();

        // Give each partition its own enumerator so that one partition's
        // Current cannot be overwritten by another partition's MoveNext().
        for (int i = 0; i < partitionCount; ++i)
        {
            result.Add(new MyEnumerator<T>(_consumingEnumerable, _ordering));
        }

        return result;
    }

    public override bool SupportsDynamicPartitions
    {
        get
        {
            return false;
        }
    }

    // Dynamic and non-orderable partitioning are not supported by this partitioner.
    public override IEnumerable<T> GetDynamicPartitions()
    {
        throw new NotImplementedException();
    }

    public override IEnumerable<KeyValuePair<long, T>> GetOrderableDynamicPartitions()
    {
        throw new NotImplementedException();
    }

    public override IList<IEnumerator<T>> GetPartitions(int partitionCount)
    {
        throw new NotImplementedException();
    }
}

class Program
{
    static void Main(string[] args)
    {
        BlockingCollection<int> itemsQueue = new BlockingCollection<int>();

        var partitioner = new ImmediateOrderedPartitioner<int>(itemsQueue);

        Random random = new Random();

        var results = partitioner
        .AsParallel()
        .AsOrdered()
        .WithMergeOptions(ParallelMergeOptions.NotBuffered)
        //.WithDegreeOfParallelism(1)
        .Select(i =>
        {
            int work = 0;

            Console.WriteLine("Working on " + i);

            for (int busy = 0; busy <= 90000000; ++busy) { ++work; };

            Console.WriteLine("Finished " + i);


            return i;
        });

        TaskCompletionSource<bool> completion = new TaskCompletionSource<bool>();

        Task.Factory.StartNew(() =>
        {
            foreach (int i in results)
            {
                Console.WriteLine("Result Available: " + i);
            }
            completion.SetResult(true);
        });

        int iterations;
        iterations = 1; // random.Next(5, 50);
        Console.WriteLine("------- iterations: " + iterations + "-------");

        for (int i = 1; i <= iterations; ++i)
        {
            itemsQueue.Add(i);
        }

        while (true)
        {
            char c = Console.ReadKey().KeyChar;

            if (c == 's')
            {
                break;
            }
            else
            {
                ++iterations;

                Console.WriteLine("adding: " + iterations);
                itemsQueue.Add(iterations);
            }
        }


        itemsQueue.CompleteAdding();

        completion.Task.Wait();

        Console.WriteLine("Done!");
        Console.ReadKey();
        itemsQueue.Dispose();
    }
}

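Alternative approach

If parallelizing the individual task (as recommended in the "short answer") is not possible, and all the other constraints of the problem still apply, then you can implement your own type of queue that spins up a task for each item. That lets the Task Parallel Library handle the scheduling of the work, while you synchronize the consumption of results yourself.

For example, something like the following (standard "no guarantees" disclaimer!):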

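using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public class QueuedItem<TInput, TResult>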
{
    private readonly object _lockObject = new object();

    private TResult _result;

    private readonly TInput _input;

    private readonly TResult _notfinished;

    internal readonly bool IsEndQueue = false;

    internal QueuedItem()
    {
        IsEndQueue = true;
    }

    public QueuedItem(TInput input, TResult notfinished)
    {
        _input = input;
        _notfinished = notfinished;
        _result = _notfinished;
    }

    public TResult ReadResult()
    {
        lock (_lockObject)
        {
            if (!IsResultReady)
                throw new InvalidOperationException("Check IsResultReady before calling ReadResult()");

            return _result;
        }
    }

    public void WriteResult(TResult value)
    {
        lock (_lockObject)
        {
            if (IsResultReady)
                throw new InvalidOperationException("Result has already been written");

            _result = value;
        }
    }

    public TInput Input { get { return _input; } }

    public bool IsResultReady
    {
        get
        {
            lock (_lockObject)
            {
                return !object.Equals(_result, _notfinished) || IsEndQueue;
            }
        }
    }
}


public class ParallelImmediateOrderedProcessingQueue<TInput, TResult>
{
    private readonly ReaderWriterLockSlim _addLock = new ReaderWriterLockSlim();

    private readonly object _readingResultsLock = new object();

    private readonly ConcurrentQueue<QueuedItem<TInput, TResult>> _concurrentQueue = new ConcurrentQueue<QueuedItem<TInput, TResult>>();

    bool _isFinishedAdding = false;

    private readonly TResult _notFinished;

    private readonly Action<QueuedItem<TInput, TResult>> _processor;

    /// <param name="notFinished">A value that indicates the result is not yet finished</param>
    /// <param name="processor">Must call WriteResult() on its argument when finished.</param>
    public ParallelImmediateOrderedProcessingQueue(TResult notFinished, Action<QueuedItem<TInput, TResult>> processor)
    {
        _notFinished = notFinished;
        _processor = processor;
    }

    public event Action ResultsReady = delegate { };

    private void SignalResult()
    {
            QueuedItem<TInput, TResult> item;
            if (_concurrentQueue.TryPeek(out item) && item.IsResultReady)
            {
                ResultsReady();
            }
    }

    public void Add(TInput input)
    {
        bool shouldThrow = false;

        _addLock.EnterReadLock();
        {
            shouldThrow = _isFinishedAdding;

            if (!shouldThrow)
            {
                var queuedItem = new QueuedItem<TInput, TResult>(input, _notFinished);

                _concurrentQueue.Enqueue(queuedItem);

                Task.Factory.StartNew(() => { _processor(queuedItem); SignalResult(); });
            }
        }
        _addLock.ExitReadLock();

        if (shouldThrow)
            throw new InvalidOperationException("An attempt was made to add an item, but adding items was marked as completed");
    }

    public IEnumerable<TResult> ConsumeReadyResults()
    {
        //lock necessary to preserve ordering
        lock (_readingResultsLock)
        {
            QueuedItem<TInput, TResult> queuedItem;

            while (_concurrentQueue.TryPeek(out queuedItem) && queuedItem.IsResultReady)
            {
                if (!_concurrentQueue.TryDequeue(out queuedItem))
                    throw new ApplicationException("this shouldn't happen");

                if (queuedItem.IsEndQueue)
                {
                    _completion.SetResult(true);
                }
                else
                {
                    yield return queuedItem.ReadResult();
                }
            }
        }
    }

    public void CompleteAddingItems()
    {
        _addLock.EnterWriteLock();
        {
            _isFinishedAdding = true;

            var queueCompletion = new QueuedItem<TInput, TResult>();

            _concurrentQueue.Enqueue(queueCompletion);
            Task.Factory.StartNew(() => { SignalResult(); });
        }
        _addLock.ExitWriteLock();
    }

    TaskCompletionSource<bool> _completion = new TaskCompletionSource<bool>();

    public void WaitForCompletion()
    {
        _completion.Task.Wait();
    }
}

class Program
{
    static void Main(string[] args)
    {
        const int notFinished = int.MinValue;

        var processingQueue = new ParallelImmediateOrderedProcessingQueue<int, int>(notFinished, qi =>
        {
            int work = 0;

            Console.WriteLine("Working on " + qi.Input);

            //simulate work
            int maxBusy = 90000000 - (10 * (qi.Input % 3));
            for (int busy = 0; busy <= maxBusy; ++busy) { ++work; };

            Console.WriteLine("Finished " + qi.Input);

            qi.WriteResult(qi.Input);
        });

        processingQueue.ResultsReady += new Action(() =>
        {
            Task.Factory.StartNew(() =>
                {
                    foreach (int result in processingQueue.ConsumeReadyResults())
                    {
                        Console.WriteLine("Results Available: " + result);
                    }
                });
        });


        int iterations = new Random().Next(5, 50);
        Console.WriteLine("------- iterations: " + iterations + "-------");

        for (int i = 1; i <= iterations; ++i)
        {
            processingQueue.Add(i);
        }

        while (true)
        {
            char c = Console.ReadKey().KeyChar;

            if (c == 's')
            {
                break;
            }
            else
            {
                ++iterations;

                Console.WriteLine("adding: " + iterations);
                processingQueue.Add(iterations);
            }
        }

        processingQueue.CompleteAddingItems();
        processingQueue.WaitForCompletion();

        Console.WriteLine("Done!");
        Console.ReadKey();
    }
}
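
The same idea (one task per item, results consumed strictly in submission order) can also be sketched more compactly by queuing the tasks themselves. This is a hedged sketch, not the code above: OrderedTaskQueue and its members are hypothetical names, it assumes Add is called from a single producer thread, and it uses only APIs available in .NET 4.0. The consumer blocks only on the oldest unfinished item, so a result is released as soon as it and everything before it are done, without waiting for CompleteAdding.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper, not part of the original answer.
public sealed class OrderedTaskQueue<TInput, TResult> : IDisposable
{
    // Tasks are stored in submission order; the TPL schedules the actual work.
    private readonly BlockingCollection<Task<TResult>> _pending =
        new BlockingCollection<Task<TResult>>();

    private readonly Func<TInput, TResult> _processor;

    public OrderedTaskQueue(Func<TInput, TResult> processor)
    {
        _processor = processor;
    }

    // Assumption: called from one producer thread, so queue order == arrival order.
    public void Add(TInput input)
    {
        _pending.Add(Task.Factory.StartNew(() => _processor(input)));
    }

    public void CompleteAdding()
    {
        _pending.CompleteAdding();
    }

    // Yields results in submission order. Reading Result waits for that task
    // and rethrows (as AggregateException) if the processor threw.
    public IEnumerable<TResult> GetConsumingEnumerable()
    {
        foreach (Task<TResult> task in _pending.GetConsumingEnumerable())
        {
            yield return task.Result;
        }
    }

    public void Dispose()
    {
        _pending.Dispose();
    }
}

static class OrderedTaskQueueDemo
{
    static void Main()
    {
        using (var queue = new OrderedTaskQueue<int, int>(i => i * i))
        {
            Task consumer = Task.Factory.StartNew(() =>
            {
                foreach (int result in queue.GetConsumingEnumerable())
                {
                    Console.WriteLine("Result Available: " + result);
                }
            });

            for (int i = 1; i <= 5; ++i)
            {
                queue.Add(i);
            }

            queue.CompleteAdding();
            consumer.Wait();
        }
    }
}
```

The trade-off in this sketch is that a slow item at the head of the queue holds back later results that are already finished, which is inherent in the ordering requirement rather than in the queue itself.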