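In my producer-consumer scenario I have multiple consumers, and each of them sends an action to external hardware, which can take some time. My pipeline looks roughly like this:
BatchBlock -> TransformBlock -> BufferBlock -> (several) ActionBlocks
I have set the BoundedCapacity of my ActionBlocks to 1. What I want, in theory, is for the BatchBlock to send a group of items to the TransformBlock only when one of my ActionBlocks is available for work. Until then, the BatchBlock should just keep buffering elements instead of passing them on to the TransformBlock. My batch sizes are variable. Since a batch size is mandatory, I do set a very high upper limit on the BatchBlock's batch size, but I never want to reach that limit; I want to trigger my batches based on the availability of the ActionBlocks that perform the task.
I achieved this with the help of the TriggerBatch() method: I call BatchBlock.TriggerBatch() as the last operation in my ActionBlock. Interestingly, after several days of working properly, the pipeline stalled. On checking, I found that the inputs to the BatchBlock sometimes arrive after the ActionBlocks have finished their work. In that case the ActionBlocks do call TriggerBatch at the end of their work, but since the BatchBlock has no input at all at that moment, the call to TriggerBatch has no effect. When inputs do flow into the BatchBlock a while later, nobody is left to call TriggerBatch and restart the pipeline. I was looking for a way to check whether anything is present in the BatchBlock's input buffer, but no such feature is available, and I could not find a way to check whether a TriggerBatch call actually produced a batch either.
Can anyone suggest a possible solution to my problem? Unfortunately, using a Timer to trigger batches is not an option for me. Except at the start of the pipeline, the throttling should be governed only by the availability of one of the ActionBlocks.
Sample code is here: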
using System;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

class Program
{
    static BatchBlock<int> _groupReadTags;

    static void Main(string[] args)
    {
        // 1000 is only an upper limit; batches are meant to be released by TriggerBatch().
        _groupReadTags = new BatchBlock<int>(1000);
        var bufferOptions = new DataflowBlockOptions { BoundedCapacity = 2 };
        BufferBlock<int> _frameBuffer = new BufferBlock<int>(bufferOptions);
        var consumerOptions = new ExecutionDataflowBlockOptions { BoundedCapacity = 1 };
        int batchNo = 1;

        // Sums each batch and passes the sum downstream.
        TransformBlock<int[], int> _workingBlock = new TransformBlock<int[], int>(list =>
        {
            Console.WriteLine("\n\nWorking on Batch Number {0}", batchNo);
            //_groupReadTags.TriggerBatch();
            int sum = 0;
            foreach (int item in list)
            {
                Console.WriteLine("Elements in batch {0} :: {1}", batchNo, item);
                sum += item;
            }
            batchNo++;
            return sum;
        });

        // Each worker simulates slow hardware and requests the next batch when it is done.
        ActionBlock<int> _worker1 = new ActionBlock<int>(async x =>
        {
            Console.WriteLine("Number from ONE :{0}", x);
            await Task.Delay(500);
            Console.WriteLine("BatchBlock Output Count : {0}", _groupReadTags.OutputCount);
            _groupReadTags.TriggerBatch();
        }, consumerOptions);

        ActionBlock<int> _worker2 = new ActionBlock<int>(async x =>
        {
            Console.WriteLine("Number from TWO :{0}", x);
            await Task.Delay(2000);
            _groupReadTags.TriggerBatch();
        }, consumerOptions);

        _groupReadTags.LinkTo(_workingBlock);
        _workingBlock.LinkTo(_frameBuffer);
        _frameBuffer.LinkTo(_worker1);
        _frameBuffer.LinkTo(_worker2);

        // Start the pipeline with an initial batch.
        _groupReadTags.Post(10);
        _groupReadTags.Post(20);
        _groupReadTags.TriggerBatch();

        Task postingTask = new Task(() => PostStuff());
        postingTask.Start();
        Console.ReadLine();
    }

    static void PostStuff()
    {
        for (int i = 0; i < 10; i++)
        {
            _groupReadTags.Post(i);
            Thread.Sleep(100);
        }
        Parallel.Invoke(
            () => _groupReadTags.Post(100),
            () => _groupReadTags.Post(200),
            () => _groupReadTags.Post(300),
            () => _groupReadTags.Post(400),
            () => _groupReadTags.Post(500),
            () => _groupReadTags.Post(600),
            () => _groupReadTags.Post(700),
            () => _groupReadTags.Post(800)
        );
    }
}
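The stall described in the question can be reproduced in isolation, without the rest of the pipeline. The following is a minimal sketch, assuming the built-in BatchBlock in its default (greedy) mode; the class name TriggerBatchDemo and the method name Run are hypothetical and exist only for this illustration. It shows that a TriggerBatch call made while the block is empty is not remembered once items arrive later.

```csharp
using System;
using System.Threading.Tasks.Dataflow;

static class TriggerBatchDemo
{
    // Returns the batch size observed after each of the three steps (0 = no batch).
    public static int[] Run()
    {
        var block = new BatchBlock<int>(1000);
        var sizes = new int[3];

        // Step 1: trigger while the block is empty. Nothing is queued, so no batch appears.
        block.TriggerBatch();
        sizes[0] = block.TryReceive(out int[] first) ? first.Length : 0;

        // Step 2: items arrive afterwards. The earlier trigger is not replayed,
        // and two items are far below the batch size of 1000, so still no batch.
        block.Post(1);
        block.Post(2);
        sizes[1] = block.TryReceive(out int[] second) ? second.Length : 0;

        // Step 3: only a new TriggerBatch call releases the buffered items.
        block.TriggerBatch();
        int[] third = block.Receive(TimeSpan.FromSeconds(5));
        sizes[2] = third.Length;

        return sizes;
    }
}
```

Calling TriggerBatchDemo.Run() and printing the three values shows whether each step produced a batch. In the question's pipeline, step 2 is where things stall: the workers have already made their last TriggerBatch call, so nothing performs step 3.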
Answer 0: (score: 0)
I have found that using TriggerBatch this way is unreliable:
_groupReadTags.Post(10);
_groupReadTags.Post(20);
_groupReadTags.TriggerBatch();
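Apparently TriggerBatch is intended to be used from within the block, not from outside it like this. I have seen this result in odd timing issues, such as items from the next batch being included in the current batch even though TriggerBatch was called first.
For an alternative that uses DataflowBlock.Encapsulate, see: BatchBlock produces batch with elements sent after TriggerBatch()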
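Answer 1: (score: 0)
Here is an alternative BatchBlock implementation with some extra features. It includes a TriggerBatch method with the following signature:
public int TriggerBatch(int nextMinBatchSizeIfEmpty);
If the input queue is not empty, calling this method triggers a batch immediately; otherwise it sets a temporary MinBatchSize that affects only the next batch. You can call this method with a small value for nextMinBatchSizeIfEmpty to ensure that, if a batch cannot be produced right now, the next batch is produced sooner than the BatchSize configured in the block's constructor would allow.
This method returns the size of the batch produced. It returns 0 if the input queue is empty, the output queue is full, or the block has completed.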
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public class BatchBlockEx<T> : ITargetBlock<T>, ISourceBlock<T[]>
{
    private readonly ITargetBlock<T> _input;
    private readonly IPropagatorBlock<T[], T[]> _output;
    private readonly Queue<T> _queue;
    private readonly object _locker = new object();
    // Temporary minimum batch size; Int32.MaxValue means "not set".
    private int _nextMinBatchSize = Int32.MaxValue;

    public Task Completion { get; }
    public int InputCount { get { lock (_locker) return _queue.Count; } }
    public int OutputCount => ((BufferBlock<T[]>)_output).Count;
    public int BatchSize { get; }

    public BatchBlockEx(int batchSize, DataflowBlockOptions dataflowBlockOptions = null)
    {
        if (batchSize < 1) throw new ArgumentOutOfRangeException(nameof(batchSize));
        dataflowBlockOptions = dataflowBlockOptions ?? new DataflowBlockOptions();
        if (dataflowBlockOptions.BoundedCapacity != DataflowBlockOptions.Unbounded &&
            dataflowBlockOptions.BoundedCapacity < batchSize)
            throw new ArgumentOutOfRangeException(nameof(batchSize),
                "Number must be no greater than the value specified in BoundedCapacity.");

        this.BatchSize = batchSize;

        _output = new BufferBlock<T[]>(dataflowBlockOptions);

        _queue = new Queue<T>(batchSize);

        // Queues each incoming item and emits a batch once the batch size
        // or the temporary minimum batch size is reached.
        _input = new ActionBlock<T>(async item =>
        {
            T[] batch = null;
            lock (_locker)
            {
                _queue.Enqueue(item);
                if (_queue.Count == batchSize || _queue.Count >= _nextMinBatchSize)
                {
                    batch = _queue.ToArray(); _queue.Clear();
                    _nextMinBatchSize = Int32.MaxValue;
                }
            }
            if (batch != null) await _output.SendAsync(batch).ConfigureAwait(false);
        }, new ExecutionDataflowBlockOptions()
        {
            BoundedCapacity = 1,
            CancellationToken = dataflowBlockOptions.CancellationToken
        });

        // On completion of the input, flush the remaining items and propagate completion.
        var inputContinuation = _input.Completion.ContinueWith(async t =>
        {
            try
            {
                T[] batch = null;
                lock (_locker)
                {
                    if (_queue.Count > 0)
                    {
                        batch = _queue.ToArray(); _queue.Clear();
                    }
                }
                if (batch != null) await _output.SendAsync(batch).ConfigureAwait(false);
            }
            finally
            {
                if (t.IsFaulted)
                {
                    _output.Fault(t.Exception.InnerException);
                }
                else
                {
                    _output.Complete();
                }
            }
        }, TaskScheduler.Default).Unwrap();

        this.Completion = Task.WhenAll(inputContinuation, _output.Completion);
    }

    public void Complete() => _input.Complete();
    void IDataflowBlock.Fault(Exception ex) => _input.Fault(ex);

    public int TriggerBatch(Func<T[], bool> condition, int nextMinBatchSizeIfEmpty)
    {
        if (nextMinBatchSizeIfEmpty < 1)
            throw new ArgumentOutOfRangeException(nameof(nextMinBatchSizeIfEmpty));
        int count = 0;
        lock (_locker)
        {
            if (_queue.Count > 0)
            {
                T[] batch = _queue.ToArray();
                if (condition == null || condition(batch))
                {
                    bool accepted = _output.Post(batch);
                    if (accepted) { _queue.Clear(); count = batch.Length; }
                }
                _nextMinBatchSize = Int32.MaxValue;
            }
            else
            {
                // Nothing to emit now; remember the reduced size for the next batch.
                _nextMinBatchSize = nextMinBatchSizeIfEmpty;
            }
        }
        return count;
    }

    public int TriggerBatch(Func<T[], bool> condition)
        => TriggerBatch(condition, Int32.MaxValue);

    public int TriggerBatch(int nextMinBatchSizeIfEmpty)
        => TriggerBatch(null, nextMinBatchSizeIfEmpty);

    public int TriggerBatch() => TriggerBatch(null, Int32.MaxValue);

    DataflowMessageStatus ITargetBlock<T>.OfferMessage(
        DataflowMessageHeader messageHeader, T messageValue,
        ISourceBlock<T> source, bool consumeToAccept)
    {
        return _input.OfferMessage(messageHeader, messageValue, source,
            consumeToAccept);
    }

    T[] ISourceBlock<T[]>.ConsumeMessage(DataflowMessageHeader messageHeader,
        ITargetBlock<T[]> target, out bool messageConsumed)
    {
        return _output.ConsumeMessage(messageHeader, target, out messageConsumed);
    }

    bool ISourceBlock<T[]>.ReserveMessage(DataflowMessageHeader messageHeader,
        ITargetBlock<T[]> target)
    {
        return _output.ReserveMessage(messageHeader, target);
    }

    void ISourceBlock<T[]>.ReleaseReservation(DataflowMessageHeader messageHeader,
        ITargetBlock<T[]> target)
    {
        _output.ReleaseReservation(messageHeader, target);
    }

    IDisposable ISourceBlock<T[]>.LinkTo(ITargetBlock<T[]> target,
        DataflowLinkOptions linkOptions)
    {
        return _output.LinkTo(target, linkOptions);
    }
}
Another overload of the TriggerBatch method lets you examine the batch that can currently be produced and decide whether it should be triggered:
public int TriggerBatch(Func<T[], bool> condition);
The BatchBlockEx class does not support the Greedy and MaxNumberOfGroups options of the built-in BatchBlock.
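To connect this back to the question, here is a minimal sketch of how a worker might use the TriggerBatch(int nextMinBatchSizeIfEmpty) overload. The Workers class and the sendToHardware parameter are hypothetical names introduced only for this illustration, and the trigger is passed in as a delegate so that the snippet compiles on its own; it is assumed to be bound to a BatchBlockEx<int> instance from the class above.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

static class Workers
{
    // triggerBatch: assumed to wrap BatchBlockEx<int>.TriggerBatch(int nextMinBatchSizeIfEmpty).
    // sendToHardware: hypothetical stand-in for the slow call to the external hardware.
    public static ActionBlock<int> Create(
        Func<int, int> triggerBatch, Func<int, Task> sendToHardware)
    {
        return new ActionBlock<int>(async item =>
        {
            await sendToHardware(item);

            // If items are already queued, a batch is released right away.
            // If the queue is empty, the minimum size of the next batch drops to 1,
            // so the first item that arrives later is emitted without another trigger.
            triggerBatch(1);
        }, new ExecutionDataflowBlockOptions { BoundedCapacity = 1 });
    }
}
```

With a BatchBlockEx<int> named batchBlock, a worker could then be created as Workers.Create(n => batchBlock.TriggerBatch(n), x => Task.Delay(500)). Passing 1 is a design choice for this scenario: it means a trigger issued while the block is empty is effectively remembered until the next item arrives, which is the case the built-in TriggerBatch does not cover.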