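Using BatchBlock.TriggerBatch()

Time: 2015-09-22 12:48:57

Tags: c# pipeline throttling tpl-dataflow

In my producer-consumer scenario I have multiple consumers, each of which sends an action to external hardware, which can take some time. My pipeline looks roughly like this:

BatchBlock -> TransformBlock -> BufferBlock -> (several) ActionBlocks

I have set the BoundedCapacity of my ActionBlocks to 1. What I want, in theory, is for the BatchBlock to send a group of items to the TransformBlock only when one of my ActionBlocks is available for work. Until then, the BatchBlock should just keep buffering elements and not pass them on to the TransformBlock. My batch sizes are variable. Because a batch size is mandatory, I do set a very high upper limit for the BatchBlock's batch size, but I really don't want to reach that limit; I want to trigger my batches according to the availability of the ActionBlocks that perform the task.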

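I achieved this with the help of the TriggerBatch() method: I call BatchBlock.TriggerBatch() as the last operation in each ActionBlock. Interestingly, after several days of working properly, the pipeline stalled. On inspection I found that sometimes input reaches the BatchBlock only after the ActionBlocks have finished their work. In that case the ActionBlocks do call TriggerBatch at the end of their work, but since the BatchBlock holds no input at that moment, the call has no effect. When input flows into the BatchBlock some time later, nobody is left to call TriggerBatch and restart the pipeline. I was looking for a way to check whether the BatchBlock's input buffer contains anything, but no such feature is available, and I could not find a way to check whether a TriggerBatch call was fruitful either.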
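The failure mode can be reduced to a few lines. This is only a minimal sketch, assuming a greedy `BatchBlock<int>` with no linked targets so that `TryReceive` can be used to observe its output; the class name `LostTriggerDemo` is made up for illustration and is not part of my pipeline.

```csharp
using System;
using System.Threading.Tasks.Dataflow;

static class LostTriggerDemo
{
    // Returns whether a batch could be received before and after the second trigger.
    public static (bool Before, bool After) Run()
    {
        var batchBlock = new BatchBlock<int>(1000);

        // A consumer finishes its work and triggers while nothing is buffered.
        // The call does nothing, and the block does not remember it.
        batchBlock.TriggerBatch();

        // Input arrives afterwards; one item is far below the batch size of 1000.
        batchBlock.Post(42);
        bool before = batchBlock.TryReceive(out int[] _);

        // Only another explicit trigger releases the buffered item.
        batchBlock.TriggerBatch();
        bool after = batchBlock.TryReceive(out int[] _);

        return (before, after);
    }

    static void Main()
    {
        var (before, after) = Run();
        Console.WriteLine($"Batch available before second trigger: {before}");
        Console.WriteLine($"Batch available after second trigger: {after}");
    }
}
```

In the real pipeline there is no second trigger, because every ActionBlock is idle and waiting for a batch that never comes.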

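Can anyone suggest a possible solution to my problem? Unfortunately, using a Timer to trigger batches is not an option for me. Apart from the start of the pipeline, throttling should be governed only by the availability of one of the ActionBlocks.

Sample code is here: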

    using System;
    using System.Threading;
    using System.Threading.Tasks;
    using System.Threading.Tasks.Dataflow;

    class Program
    {
        static BatchBlock<int> _groupReadTags;

        static void Main(string[] args)
        {
            // 1000 is only an upper limit; batches are meant to be triggered manually.
            _groupReadTags = new BatchBlock<int>(1000);

            var bufferOptions = new DataflowBlockOptions { BoundedCapacity = 2 };
            BufferBlock<int> _frameBuffer = new BufferBlock<int>(bufferOptions);
            var consumerOptions = new ExecutionDataflowBlockOptions { BoundedCapacity = 1 };
            int batchNo = 1;

            TransformBlock<int[], int> _workingBlock = new TransformBlock<int[], int>(list =>
            {
                Console.WriteLine("\n\nWorking on Batch Number {0}", batchNo);
                //_groupReadTags.TriggerBatch();
                int sum = 0;

                foreach (int item in list)
                {
                    Console.WriteLine("Elements in batch {0} :: {1}", batchNo, item);
                    sum += item;
                }
                batchNo++;
                return sum;
            });

            ActionBlock<int> _worker1 = new ActionBlock<int>(async x =>
            {
                Console.WriteLine("Number from ONE :{0}", x);
                await Task.Delay(500);

                Console.WriteLine("BatchBlock Output Count : {0}", _groupReadTags.OutputCount);

                // Last operation of the consumer: ask for the next batch.
                _groupReadTags.TriggerBatch();
            }, consumerOptions);

            ActionBlock<int> _worker2 = new ActionBlock<int>(async x =>
            {
                Console.WriteLine("Number from TWO :{0}", x);
                await Task.Delay(2000);
                _groupReadTags.TriggerBatch();
            }, consumerOptions);

            _groupReadTags.LinkTo(_workingBlock);
            _workingBlock.LinkTo(_frameBuffer);
            _frameBuffer.LinkTo(_worker1);
            _frameBuffer.LinkTo(_worker2);

            // Start the pipeline with a first, manually triggered batch.
            _groupReadTags.Post(10);
            _groupReadTags.Post(20);
            _groupReadTags.TriggerBatch();

            Task postingTask = new Task(() => PostStuff());
            postingTask.Start();
            Console.ReadLine();
        }

        static void PostStuff()
        {
            for (int i = 0; i < 10; i++)
            {
                _groupReadTags.Post(i);
                Thread.Sleep(100);
            }

            Parallel.Invoke(
                () => _groupReadTags.Post(100),
                () => _groupReadTags.Post(200),
                () => _groupReadTags.Post(300),
                () => _groupReadTags.Post(400),
                () => _groupReadTags.Post(500),
                () => _groupReadTags.Post(600),
                () => _groupReadTags.Post(700),
                () => _groupReadTags.Post(800)
            );
        }
    }

2 answers:

Answer 0 (score: 0)

I have found that using TriggerBatch in this way is unreliable:

    _groupReadTags.Post(10);
    _groupReadTags.Post(20);
    _groupReadTags.TriggerBatch();

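Apparently TriggerBatch is intended to be used from within the block, not from outside it like this. I have seen this result in odd timing issues, such as items from the next batch being included in the current batch even though TriggerBatch was called first.

For an alternative that uses DataflowBlock.Encapsulate, see my answer to this question: "BatchBlock produces batch with elements sent after TriggerBatch()".

Answer 1 (score: 0)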

Here is an alternative BatchBlock implementation with some extra features. It includes a TriggerBatch method with this signature:

    public int TriggerBatch(int nextMinBatchSizeIfEmpty);

Invoking this method triggers a batch immediately if the input queue is not empty; otherwise it sets a temporary MinBatchSize that affects only the next batch. You can invoke this method with a small value for nextMinBatchSizeIfEmpty to ensure that, if a batch cannot be produced right now, the next batch will be produced sooner than the BatchSize configured in the block's constructor would allow.

This method returns the size of the batch produced. It returns 0 if the input queue is empty, if the output queue is full, or if the block has completed.

    using System;
    using System.Collections.Generic;
    using System.Threading.Tasks;
    using System.Threading.Tasks.Dataflow;

    public class BatchBlockEx<T> : ITargetBlock<T>, ISourceBlock<T[]>
    {
        private readonly ITargetBlock<T> _input;
        private readonly IPropagatorBlock<T[], T[]> _output;
        private readonly Queue<T> _queue;
        private readonly object _locker = new object();
        private int _nextMinBatchSize = Int32.MaxValue;

        public Task Completion { get; }
        public int InputCount { get { lock (_locker) return _queue.Count; } }
        public int OutputCount => ((BufferBlock<T[]>)_output).Count;
        public int BatchSize { get; }

        public BatchBlockEx(int batchSize, DataflowBlockOptions dataflowBlockOptions = null)
        {
            if (batchSize < 1) throw new ArgumentOutOfRangeException(nameof(batchSize));
            dataflowBlockOptions = dataflowBlockOptions ?? new DataflowBlockOptions();
            if (dataflowBlockOptions.BoundedCapacity != DataflowBlockOptions.Unbounded &&
                dataflowBlockOptions.BoundedCapacity < batchSize)
                throw new ArgumentOutOfRangeException(nameof(batchSize),
                    "Number must be no greater than the value specified in BoundedCapacity.");

            this.BatchSize = batchSize;

            _output = new BufferBlock<T[]>(dataflowBlockOptions);

            _queue = new Queue<T>(batchSize);

            // Collects items and emits a batch when the queue reaches either the
            // configured size or the temporary minimum set by TriggerBatch.
            _input = new ActionBlock<T>(async item =>
            {
                T[] batch = null;
                lock (_locker)
                {
                    _queue.Enqueue(item);
                    if (_queue.Count == batchSize || _queue.Count >= _nextMinBatchSize)
                    {
                        batch = _queue.ToArray();
                        _queue.Clear();
                        _nextMinBatchSize = Int32.MaxValue;
                    }
                }
                if (batch != null) await _output.SendAsync(batch).ConfigureAwait(false);
            }, new ExecutionDataflowBlockOptions()
            {
                BoundedCapacity = 1,
                CancellationToken = dataflowBlockOptions.CancellationToken
            });

            // When the input completes, flush what is left and complete the output.
            var inputContinuation = _input.Completion.ContinueWith(async t =>
            {
                try
                {
                    T[] batch = null;
                    lock (_locker)
                    {
                        if (_queue.Count > 0)
                        {
                            batch = _queue.ToArray();
                            _queue.Clear();
                        }
                    }
                    if (batch != null) await _output.SendAsync(batch).ConfigureAwait(false);
                }
                finally
                {
                    if (t.IsFaulted)
                    {
                        _output.Fault(t.Exception.InnerException);
                    }
                    else
                    {
                        _output.Complete();
                    }
                }
            }, TaskScheduler.Default).Unwrap();

            this.Completion = Task.WhenAll(inputContinuation, _output.Completion);
        }

        public void Complete() => _input.Complete();
        void IDataflowBlock.Fault(Exception ex) => _input.Fault(ex);

        public int TriggerBatch(Func<T[], bool> condition, int nextMinBatchSizeIfEmpty)
        {
            if (nextMinBatchSizeIfEmpty < 1)
                throw new ArgumentOutOfRangeException(nameof(nextMinBatchSizeIfEmpty));
            int count = 0;
            lock (_locker)
            {
                if (_queue.Count > 0)
                {
                    T[] batch = _queue.ToArray();
                    if (condition == null || condition(batch))
                    {
                        bool accepted = _output.Post(batch);
                        if (accepted) { _queue.Clear(); count = batch.Length; }
                    }
                    _nextMinBatchSize = Int32.MaxValue;
                }
                else
                {
                    // Nothing to emit now: remember the request for the next batch.
                    _nextMinBatchSize = nextMinBatchSizeIfEmpty;
                }
            }
            return count;
        }

        public int TriggerBatch(Func<T[], bool> condition)
            => TriggerBatch(condition, Int32.MaxValue);

        public int TriggerBatch(int nextMinBatchSizeIfEmpty)
            => TriggerBatch(null, nextMinBatchSizeIfEmpty);

        public int TriggerBatch() => TriggerBatch(null, Int32.MaxValue);

        DataflowMessageStatus ITargetBlock<T>.OfferMessage(
            DataflowMessageHeader messageHeader, T messageValue,
            ISourceBlock<T> source, bool consumeToAccept)
        {
            return _input.OfferMessage(messageHeader, messageValue, source, consumeToAccept);
        }

        T[] ISourceBlock<T[]>.ConsumeMessage(DataflowMessageHeader messageHeader,
            ITargetBlock<T[]> target, out bool messageConsumed)
        {
            return _output.ConsumeMessage(messageHeader, target, out messageConsumed);
        }

        bool ISourceBlock<T[]>.ReserveMessage(DataflowMessageHeader messageHeader,
            ITargetBlock<T[]> target)
        {
            return _output.ReserveMessage(messageHeader, target);
        }

        void ISourceBlock<T[]>.ReleaseReservation(DataflowMessageHeader messageHeader,
            ITargetBlock<T[]> target)
        {
            _output.ReleaseReservation(messageHeader, target);
        }

        IDisposable ISourceBlock<T[]>.LinkTo(ITargetBlock<T[]> target,
            DataflowLinkOptions linkOptions)
        {
            return _output.LinkTo(target, linkOptions);
        }
    }

Another overload of the TriggerBatch method allows you to examine the batch that can currently be produced, and to decide whether it should be triggered:

    public int TriggerBatch(Func<T[], bool> condition);

The BatchBlockEx class does not support the MaxNumberOfGroups and Greedy options of the built-in BatchBlock.
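The core idea, remembering a trigger that arrives while the queue is empty, can be shown in isolation. The following is only a simplified sketch without any TPL Dataflow types; `DeferredTriggerBuffer<T>`, its `emit` callback and `DeferredTriggerDemo` are hypothetical names used for illustration, and the behavior corresponds roughly to always calling TriggerBatch with nextMinBatchSizeIfEmpty set to 1.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: not part of TPL Dataflow and not the BatchBlockEx class.
sealed class DeferredTriggerBuffer<T>
{
    private readonly Queue<T> _queue = new Queue<T>();
    private readonly object _locker = new object();
    private readonly Action<T[]> _emit;
    private bool _triggerPending;

    public DeferredTriggerBuffer(Action<T[]> emit)
    {
        _emit = emit ?? throw new ArgumentNullException(nameof(emit));
    }

    public void Add(T item)
    {
        T[] batch = null;
        lock (_locker)
        {
            _queue.Enqueue(item);
            // An earlier trigger found the queue empty: honor it now.
            if (_triggerPending) batch = Drain();
        }
        if (batch != null) _emit(batch);
    }

    // Returns the size of the emitted batch, or 0 if the trigger was deferred.
    public int Trigger()
    {
        T[] batch = null;
        lock (_locker)
        {
            if (_queue.Count > 0) batch = Drain();
            else _triggerPending = true;
        }
        if (batch == null) return 0;
        _emit(batch);
        return batch.Length;
    }

    // Must be called while holding the lock.
    private T[] Drain()
    {
        T[] batch = _queue.ToArray();
        _queue.Clear();
        _triggerPending = false;
        return batch;
    }
}

static class DeferredTriggerDemo
{
    // Returns the sizes of the batches emitted, in order.
    public static List<int> Run()
    {
        var sizes = new List<int>();
        var buffer = new DeferredTriggerBuffer<int>(batch => sizes.Add(batch.Length));

        buffer.Trigger(); // queue is empty, so the trigger is remembered
        buffer.Add(1);    // emitted at once because a trigger was pending
        buffer.Add(2);
        buffer.Add(3);
        buffer.Trigger(); // emits the two buffered items
        return sizes;
    }

    static void Main() => Console.WriteLine(string.Join(",", Run())); // prints "1,2"
}
```

In the pipeline from the question, the equivalent step with the BatchBlockEx class would presumably be to end each ActionBlock with a call such as TriggerBatch(1) instead of the parameterless TriggerBatch(), so that an idle consumer's request is not lost when input arrives late.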