Question

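In my producer-consumer scenario I have multiple consumers, and each consumer sends an action to external hardware, which may take some time. My pipeline looks somewhat like this:

BatchBlock -> TransformBlock -> BufferBlock -> (several) ActionBlocks

I have set the BoundedCapacity of my ActionBlocks to 1. What I want, in theory, is for the BatchBlock to send a group of items to the TransformBlock only when one of my ActionBlocks is available for work. Until then, the BatchBlock should just keep buffering elements and not pass them on to the TransformBlock. My batch sizes are variable. Since a batch size is mandatory, I do set a very high upper limit for the BatchBlock's batch size, but I never really want to reach that limit; I want to trigger my batches based on the availability of the ActionBlocks that perform the task.

I achieved this with the help of the TriggerBatch() method: I call BatchBlock.TriggerBatch() as the last operation in my ActionBlock. Interestingly, after several days of working properly, the pipeline stalled. On checking, I found that sometimes the input to the BatchBlock arrives after the ActionBlocks have finished their work. In that case the ActionBlocks do call TriggerBatch at the end of their work, but since the BatchBlock has no input at all at that moment, the call to TriggerBatch has no effect. When input flows into the BatchBlock some time later, nobody is left to call TriggerBatch and restart the pipeline. I looked for a way to check whether there is something in the BatchBlock's input buffer, but no such feature is available, and I could not find a way to check whether a TriggerBatch call actually produced a batch either.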
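
The lost trigger described above can be reproduced in isolation. The following is a minimal sketch, assuming a default (greedy, unbounded) BatchBlock with nothing linked to it; the class name `LostTriggerDemo` and its members are made up for illustration and are not part of my real pipeline.

```csharp
using System;
using System.Threading.Tasks.Dataflow;

public static class LostTriggerDemo
{
    // Reports whether the early trigger was lost, and the size of the batch
    // released by the second trigger.
    public static (bool TriggerWasLost, int BatchSize) Run()
    {
        var batcher = new BatchBlock<int>(1000);

        // A consumer finishes first: the block is empty, so nothing is produced.
        batcher.TriggerBatch();

        // Input arrives afterwards; the earlier trigger is not remembered.
        batcher.Post(10);
        batcher.Post(20);
        bool triggerWasLost = !batcher.TryReceive(out int[] pending);

        // Only a further TriggerBatch() call releases the buffered items.
        batcher.TriggerBatch();
        int batchSize = batcher.TryReceive(out int[] batch) ? batch.Length : 0;

        return (triggerWasLost, batchSize);
    }

    public static void Main()
    {
        var (lost, size) = Run();
        Console.WriteLine($"Early trigger lost: {lost}, batch size after second trigger: {size}");
    }
}
```

In the real pipeline nobody issues that second TriggerBatch() call, because every ActionBlock is idle and waiting for input.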

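Can anyone suggest a possible solution to my problem? Unfortunately, using a Timer to trigger batches is not an option for me. Apart from the start of the pipeline, throttling should be governed solely by the availability of one of the ActionBlocks.

The sample code is here: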

    using System;
    using System.Threading;
    using System.Threading.Tasks;
    using System.Threading.Tasks.Dataflow;

    class Program
    {
        static BatchBlock<int> _groupReadTags;

        static void Main(string[] args)
        {
            // The batch size is only an upper limit; batches are meant to be
            // released by the TriggerBatch() calls made by the workers.
            _groupReadTags = new BatchBlock<int>(1000);

            var bufferOptions = new DataflowBlockOptions { BoundedCapacity = 2 };
            BufferBlock<int> _frameBuffer = new BufferBlock<int>(bufferOptions);
            var consumerOptions = new ExecutionDataflowBlockOptions { BoundedCapacity = 1 };
            int batchNo = 1;

            // Sums up each batch and logs its elements.
            TransformBlock<int[], int> _workingBlock = new TransformBlock<int[], int>(list =>
            {
                Console.WriteLine("\n\nWorking on Batch Number {0}", batchNo);
                //_groupReadTags.TriggerBatch();
                int sum = 0;

                foreach (int item in list)
                {
                    Console.WriteLine("Elements in batch {0} :: {1}", batchNo, item);
                    sum += item;
                }
                batchNo++;
                return sum;
            });

            // Each worker simulates a slow hardware action, then asks for the next batch.
            ActionBlock<int> _worker1 = new ActionBlock<int>(async x =>
            {
                Console.WriteLine("Number from ONE :{0}", x);
                await Task.Delay(500);

                Console.WriteLine("BatchBlock Output Count : {0}", _groupReadTags.OutputCount);

                _groupReadTags.TriggerBatch();
            }, consumerOptions);

            ActionBlock<int> _worker2 = new ActionBlock<int>(async x =>
            {
                Console.WriteLine("Number from TWO :{0}", x);
                await Task.Delay(2000);
                _groupReadTags.TriggerBatch();
            }, consumerOptions);

            _groupReadTags.LinkTo(_workingBlock);
            _workingBlock.LinkTo(_frameBuffer);
            _frameBuffer.LinkTo(_worker1);
            _frameBuffer.LinkTo(_worker2);

            // Start the pipeline with an initial, manually triggered batch.
            _groupReadTags.Post(10);
            _groupReadTags.Post(20);
            _groupReadTags.TriggerBatch();

            Task postingTask = new Task(() => PostStuff());
            postingTask.Start();
            Console.ReadLine();
        }

        static void PostStuff()
        {
            // Slow, steady input.
            for (int i = 0; i < 10; i++)
            {
                _groupReadTags.Post(i);
                Thread.Sleep(100);
            }

            // A burst of concurrent input.
            Parallel.Invoke(
                () => _groupReadTags.Post(100),
                () => _groupReadTags.Post(200),
                () => _groupReadTags.Post(300),
                () => _groupReadTags.Post(400),
                () => _groupReadTags.Post(500),
                () => _groupReadTags.Post(600),
                () => _groupReadTags.Post(700),
                () => _groupReadTags.Post(800));
        }
    }

Answer 1

I have found using TriggerBatch in this way to be unreliable:

    _groupReadTags.Post(10);
    _groupReadTags.Post(20);
    _groupReadTags.TriggerBatch();

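Apparently TriggerBatch is intended to be used from within the block, not from outside it like this. I have seen this result in odd timing issues, such as items from the next batch being included in the current batch, even though TriggerBatch was called first.

See my answer to this question, which uses DataflowBlock.Encapsulate: "BatchBlock produces batch with elements sent after TriggerBatch()"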

Answer 2

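Here is an alternative BatchBlock implementation with some extra features. It includes a TriggerBatch method with the following signature:

    public int TriggerBatch(int nextMinBatchSizeIfEmpty);

Calling this method triggers a batch immediately if the input queue is not empty; otherwise it sets a temporary MinBatchSize that affects only the next batch. You can call this method with a small value for nextMinBatchSizeIfEmpty to ensure that, if a batch cannot be produced right now, the next batch will be produced sooner than the BatchSize configured in the block's constructor would allow.

This method returns the size of the batch produced. It returns 0 if the input queue is empty, or the output queue is full, or the block has completed.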

    using System;
    using System.Collections.Generic;
    using System.Threading.Tasks;
    using System.Threading.Tasks.Dataflow;

    public class BatchBlockEx<T> : ITargetBlock<T>, ISourceBlock<T[]>
    {
        private readonly ITargetBlock<T> _input;
        private readonly IPropagatorBlock<T[], T[]> _output;
        private readonly Queue<T> _queue;
        private readonly object _locker = new object();
        private int _nextMinBatchSize = Int32.MaxValue;

        public Task Completion { get; }
        public int InputCount { get { lock (_locker) return _queue.Count; } }
        public int OutputCount => ((BufferBlock<T[]>)_output).Count;
        public int BatchSize { get; }

        public BatchBlockEx(int batchSize, DataflowBlockOptions dataflowBlockOptions = null)
        {
            if (batchSize < 1) throw new ArgumentOutOfRangeException(nameof(batchSize));
            dataflowBlockOptions = dataflowBlockOptions ?? new DataflowBlockOptions();
            if (dataflowBlockOptions.BoundedCapacity != DataflowBlockOptions.Unbounded &&
                dataflowBlockOptions.BoundedCapacity < batchSize)
                throw new ArgumentOutOfRangeException(nameof(batchSize),
                    "Number must be no greater than the value specified in BoundedCapacity.");

            this.BatchSize = batchSize;

            _output = new BufferBlock<T[]>(dataflowBlockOptions);

            _queue = new Queue<T>(batchSize);

            // Collects items and emits a batch when the regular size, or the
            // temporary minimum size set by TriggerBatch, is reached.
            _input = new ActionBlock<T>(async item =>
            {
                T[] batch = null;
                lock (_locker)
                {
                    _queue.Enqueue(item);
                    if (_queue.Count == batchSize || _queue.Count >= _nextMinBatchSize)
                    {
                        batch = _queue.ToArray(); _queue.Clear();
                        _nextMinBatchSize = Int32.MaxValue;
                    }
                }
                if (batch != null) await _output.SendAsync(batch).ConfigureAwait(false);
            }, new ExecutionDataflowBlockOptions()
            {
                BoundedCapacity = 1,
                CancellationToken = dataflowBlockOptions.CancellationToken
            });

            // On completion, flush the remaining items as a final batch.
            var inputContinuation = _input.Completion.ContinueWith(async t =>
            {
                try
                {
                    T[] batch = null;
                    lock (_locker)
                    {
                        if (_queue.Count > 0)
                        {
                            batch = _queue.ToArray(); _queue.Clear();
                        }
                    }
                    if (batch != null) await _output.SendAsync(batch).ConfigureAwait(false);
                }
                finally
                {
                    if (t.IsFaulted)
                    {
                        _output.Fault(t.Exception.InnerException);
                    }
                    else
                    {
                        _output.Complete();
                    }
                }
            }, TaskScheduler.Default).Unwrap();

            this.Completion = Task.WhenAll(inputContinuation, _output.Completion);
        }

        public void Complete() => _input.Complete();
        void IDataflowBlock.Fault(Exception ex) => _input.Fault(ex);

        public int TriggerBatch(Func<T[], bool> condition, int nextMinBatchSizeIfEmpty)
        {
            if (nextMinBatchSizeIfEmpty < 1)
                throw new ArgumentOutOfRangeException(nameof(nextMinBatchSizeIfEmpty));
            int count = 0;
            lock (_locker)
            {
                if (_queue.Count > 0)
                {
                    T[] batch = _queue.ToArray();
                    if (condition == null || condition(batch))
                    {
                        bool accepted = _output.Post(batch);
                        if (accepted) { _queue.Clear(); count = batch.Length; }
                    }
                    _nextMinBatchSize = Int32.MaxValue;
                }
                else
                {
                    // Nothing to emit now: remember the request for the next batch.
                    _nextMinBatchSize = nextMinBatchSizeIfEmpty;
                }
            }
            return count;
        }

        public int TriggerBatch(Func<T[], bool> condition)
            => TriggerBatch(condition, Int32.MaxValue);

        public int TriggerBatch(int nextMinBatchSizeIfEmpty)
            => TriggerBatch(null, nextMinBatchSizeIfEmpty);

        public int TriggerBatch() => TriggerBatch(null, Int32.MaxValue);

        DataflowMessageStatus ITargetBlock<T>.OfferMessage(
            DataflowMessageHeader messageHeader, T messageValue,
            ISourceBlock<T> source, bool consumeToAccept)
        {
            return _input.OfferMessage(messageHeader, messageValue, source, consumeToAccept);
        }

        T[] ISourceBlock<T[]>.ConsumeMessage(DataflowMessageHeader messageHeader,
            ITargetBlock<T[]> target, out bool messageConsumed)
        {
            return _output.ConsumeMessage(messageHeader, target, out messageConsumed);
        }

        bool ISourceBlock<T[]>.ReserveMessage(DataflowMessageHeader messageHeader,
            ITargetBlock<T[]> target)
        {
            return _output.ReserveMessage(messageHeader, target);
        }

        void ISourceBlock<T[]>.ReleaseReservation(DataflowMessageHeader messageHeader,
            ITargetBlock<T[]> target)
        {
            _output.ReleaseReservation(messageHeader, target);
        }

        IDisposable ISourceBlock<T[]>.LinkTo(ITargetBlock<T[]> target,
            DataflowLinkOptions linkOptions)
        {
            return _output.LinkTo(target, linkOptions);
        }
    }

Another overload of the TriggerBatch method allows you to inspect the batch that can currently be produced and decide whether it should be triggered:

    public int TriggerBatch(Func<T[], bool> condition);

The BatchBlockEx class does not support the {{3}} and Greedy options of the built-in BatchBlock.
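
To illustrate just the "remembered trigger" idea behind nextMinBatchSizeIfEmpty, here is a stripped-down sketch with no dataflow plumbing. `PendingTriggerBatcher<T>` and `PendingTriggerDemo` are hypothetical names made up for this illustration; they are not part of BatchBlockEx or of the TPL Dataflow library, and the sketch hands batches back to the caller instead of propagating them to linked blocks.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified illustration of a trigger that is remembered
// when it arrives while the queue is empty.
public sealed class PendingTriggerBatcher<T>
{
    private readonly Queue<T> _queue = new Queue<T>();
    private readonly object _locker = new object();
    private readonly int _batchSize;
    private int _nextMinBatchSize = int.MaxValue;

    public PendingTriggerBatcher(int batchSize)
    {
        if (batchSize < 1) throw new ArgumentOutOfRangeException(nameof(batchSize));
        _batchSize = batchSize;
    }

    // Adds an item. Returns a batch once the regular batch size or the
    // temporary minimum size is reached; otherwise returns null.
    public T[] Add(T item)
    {
        lock (_locker)
        {
            _queue.Enqueue(item);
            if (_queue.Count < _batchSize && _queue.Count < _nextMinBatchSize) return null;
            return Flush();
        }
    }

    // Returns the queued items as a batch. If the queue is empty, returns null
    // and remembers the requested minimum size for the next batch only.
    public T[] TriggerBatch(int nextMinBatchSizeIfEmpty)
    {
        if (nextMinBatchSizeIfEmpty < 1)
            throw new ArgumentOutOfRangeException(nameof(nextMinBatchSizeIfEmpty));
        lock (_locker)
        {
            if (_queue.Count > 0) return Flush();
            _nextMinBatchSize = nextMinBatchSizeIfEmpty;
            return null;
        }
    }

    // Must be called while holding _locker.
    private T[] Flush()
    {
        T[] batch = _queue.ToArray();
        _queue.Clear();
        _nextMinBatchSize = int.MaxValue;
        return batch;
    }
}

public static class PendingTriggerDemo
{
    // An idle consumer triggers on an empty batcher, then a single item arrives.
    public static (bool EmptyTriggerReturnedNull, int LateBatchSize) Run()
    {
        var batcher = new PendingTriggerBatcher<int>(batchSize: 1000);
        int[] none = batcher.TriggerBatch(nextMinBatchSizeIfEmpty: 1);
        int[] late = batcher.Add(42);
        return (none == null, late == null ? 0 : late.Length);
    }
}
```

With the question's pipeline in mind, each ActionBlock would make the equivalent call on BatchBlockEx, TriggerBatch(1), as its last operation, so that an idle consumer is served by the very next item that arrives instead of waiting for a trigger that never comes.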

Using BatchBlock.TriggerBatch()
