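I'm working on a TPL Dataflow pipeline and noticed some strange behavior related to ordering/parallelism in TransformManyBlock (it might apply to other blocks as well).
Here is my code to reproduce it (.NET 4.7.2, TPL Dataflow 4.9.0):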
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks.Dataflow;

class Program
{
    static void Main(string[] args)
    {
        // Source block: expands each i into the tuples (i, 0) .. (i, i - 1).
        var sourceBlock = new TransformManyBlock<int, Tuple<int, int>>(i => Source(i),
            new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 4, EnsureOrdered = false });
        // Target block: just logs every tuple it receives.
        var targetBlock = new ActionBlock<Tuple<int, int>>(tpl =>
        {
            Console.WriteLine($"Received ({tpl.Item1}, {tpl.Item2})");
        },
        new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 4, EnsureOrdered = true });
        sourceBlock.LinkTo(targetBlock, new DataflowLinkOptions { PropagateCompletion = true });
        for (int i = 0; i < 10; i++)
        {
            sourceBlock.Post(i);
        }
        sourceBlock.Complete();
        targetBlock.Completion.Wait();
        Console.WriteLine("Finished");
        Console.Read();
    }

    static IEnumerable<Tuple<int, int>> Source(int i)
    {
        var rand = new Random(543543254);
        for (int j = 0; j < i; j++)
        {
            // Simulate slow work before each item is produced.
            Thread.Sleep(rand.Next(100, 1500));
            Console.WriteLine($"Returning ({i}, {j})");
            yield return Tuple.Create(i, j);
        }
    }
}
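The behavior I want is the following: within each i, the messages should arrive ordered by j. As I understand it, the nature of yield return already satisfies this secondary sort condition, so EnsureOrdered can be set to false. If it is set to true, the source block holds on to messages for an unacceptably long time, because it waits for all the yield return statements of an item to complete before passing the messages on. (In the real application many GB of data are processed, so we want data to propagate through the pipeline as quickly as possible in order to free up RAM.) This is sample output when EnsureOrdered on the source block is set to true: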
Returning (1, 0)
Returning (2, 0)
Returning (4, 0)
Returning (3, 0)
Returning (2, 1)
Returning (4, 1)
Returning (3, 1)
Received (1, 0)
Received (2, 0)
Received (2, 1)
Returning (4, 2)
Returning (3, 2)
Received (3, 0)
Received (3, 1)
Received (3, 2)
Returning (5, 0)
Returning (6, 0)
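We can see that the source block works in parallel, but waits to propagate messages until all the messages for the next i in line have been generated (as expected).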
However, when EnsureOrdered on the source block is false (as in the code sample), I get the following output:
Returning (2, 0)
Received (2, 0)
Returning (2, 1)
Received (2, 1)
Returning (4, 0)
Received (4, 0)
Returning (4, 1)
Received (4, 1)
Returning (4, 2)
Received (4, 2)
Returning (4, 3)
Received (4, 3)
Returning (1, 0)
Received (1, 0)
Returning (3, 0)
Received (3, 0)
Returning (3, 1)
Received (3, 1)
Returning (3, 2)
Received (3, 2)
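The source block now propagates messages as soon as they become available, but the parallelism appears to be lost, since it only works on one i at a time.
Why is this? How can I force it to process in parallel?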
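To put a number on the lost parallelism instead of reading it off the interleaving of the log lines, a probe along the following lines could be used. This is only a rough sketch: the names ConcurrencyProbe, Gate and MaxActive are hypothetical, and the fixed 200 ms delay is an assumption that replaces the random sleep from the repro above. It counts how many Source iterators are inside the simulated work at the same moment and prints the highest count seen.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks.Dataflow;

public static class ConcurrencyProbe
{
    private static readonly object Gate = new object();
    private static int _active;

    // Highest number of Source iterators seen inside the simulated work at once.
    public static int MaxActive { get; private set; }

    private static IEnumerable<Tuple<int, int>> Source(int i)
    {
        for (int j = 0; j < i; j++)
        {
            lock (Gate)
            {
                _active++;
                if (_active > MaxActive) MaxActive = _active;
            }

            Thread.Sleep(200); // simulated work; fixed delay is an assumption

            lock (Gate) { _active--; }

            yield return Tuple.Create(i, j);
        }
    }

    public static void Main()
    {
        var sourceBlock = new TransformManyBlock<int, Tuple<int, int>>(
            i => Source(i),
            new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 4, EnsureOrdered = false });

        var targetBlock = new ActionBlock<Tuple<int, int>>(
            tpl => Console.WriteLine($"Received ({tpl.Item1}, {tpl.Item2})"));

        sourceBlock.LinkTo(targetBlock, new DataflowLinkOptions { PropagateCompletion = true });

        for (int i = 0; i < 10; i++) sourceBlock.Post(i);
        sourceBlock.Complete();
        targetBlock.Completion.Wait();

        Console.WriteLine($"Max concurrent Source iterators: {MaxActive}");
    }
}
```

If the reported maximum stays at 1 with EnsureOrdered = false and rises above 1 once it is switched to true, that would confirm what the output above suggests: the iterators are being enumerated one at a time in the unordered case.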