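My program uses an iterator to traverse a map and spawns a number of worker threads to process the points read from the iterator, which all works fine. Now I would like to write output for each point. For that I use an in-memory buffer to make sure the data is collected from the threads in the correct order before it is written to the file (the writing is done through another iterator):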
public class MapMain
{
    // Multiple threads are used here; each thread starts in Run(),
    // then requests and processes map points.
    public void Run()
    {
        // Get a point from somewhere and process it
        int pointIndex = ...
        bufferWriter.StartPoint(pointIndex);
        // Perform a number of computations.
        // For simplicity, numberOfComputations = 1 in this example
        bufferWriter.BufferValue(pointIndex, value);
        bufferWriter.EndPoint(pointIndex);
    }
}
My attempt at implementing the buffer:
using System.Collections.Generic;
using System.Threading;

public class BufferWriter
{
    private const int BufferSize = 4;
    private readonly IIterator iterator;
    private readonly float?[] bufferArray;
    private readonly bool[] bufferingCompleted;
    private readonly SortedDictionary<long, int> pointIndexToBufferIndexMap;
    private readonly object syncObject = new object();
    private int bufferCount = 0;
    private int endBufferCount = 0;

    public BufferWriter(....)
    {
        iterator = ...
        bufferArray = new float?[BufferSize];
        bufferingCompleted = new bool[BufferSize];
        pointIndexToBufferIndexMap = new SortedDictionary<long, int>();
    }

    public void StartPoint(long pointIndex)
    {
        lock (syncObject)
        {
            if (bufferCount == BufferSize)
            {
                Monitor.Wait(syncObject);
            }
            pointIndexToBufferIndexMap.Add(pointIndex, bufferCount);
            bufferCount++;
        }
    }

    public void BufferValue(long pointIndex, float value)
    {
        lock (syncObject)
        {
            int bufferIndex = pointIndexToBufferIndexMap[pointIndex];
            bufferArray[bufferIndex] = value;
        }
    }

    public void EndPoint(long pointIndex)
    {
        lock (syncObject)
        {
            int bufferIndex = pointIndexToBufferIndexMap[pointIndex];
            bufferingCompleted[bufferIndex] = true;
            endBufferCount++;
            if (endBufferCount == BufferSize)
            {
                FlushBuffer();
                Monitor.PulseAll(syncObject);
            }
        }
    }

    private void FlushBuffer()
    {
        // Iterate in order of points
        foreach (long pointIndex in pointIndexToBufferIndexMap.Keys)
        {
            // Move iterator
            iterator.MoveNext();
            int bufferIndex = pointIndexToBufferIndexMap[pointIndex];
            if (bufferArray[bufferIndex].HasValue)
            {
                iterator.Current = bufferArray[bufferIndex];
                // Clear to null
                bufferArray[bufferIndex] = null;
            }
        }
        bufferCount = 0;
        endBufferCount = 0;
        pointIndexToBufferIndexMap.Clear();
    }
}
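I am looking for feedback to fix the errors in my code and to resolve any performance problems:
[1] In short: I have a fixed-size buffer that collects data from multiple threads, which process points in a somewhat random order. When the buffer is completely filled with data, it must be flushed. But what if I have collected points 0 to 9 and point 8 is missing? My buffer is already full, and any point that tries to use the buffer blocks until a flush is performed, which requires point 8.
[2] The order of the values in the buffer does not correspond to the order of the map points the values refer to. If it did, I think flushing would be easier (is array access faster than SortedDictionary lookup?). It might also allow flushed slots to be reused for incoming data (a circular buffer?). But I cannot think of a working model to achieve this.
[3] The buffer waits until it is completely full before flushing. In many cases a thread calls EndPoint() while iterator.Current happens to refer to that very point. For such a point it might make more sense to "write" immediately (i.e. set iterator.Current and advance the iterator once), but how can that be done?
To be clear, the writing iterator in BufferWriter has a buffer of its own that caches the values assigned to its Current property before writing them to the output, but I do not need to worry about that.
I feel the whole thing needs to be rewritten from scratch!
Any help is appreciated, thanks.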
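Answer 0: (score: 1)
I would not do the parallelization "by hand"; port it to the TPL or PLINQ instead. Since you are talking about a map, you can enumerate a fixed set of points by coordinates and let PLINQ worry about the parallelism.
Example: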
// First get your map points; this could be just a lazy iterator over every map point
IEnumerable<MapPoint> mapPoints = ...
// Now use PLINQ to compute in parallel while maintaining order
var computedMapPoints = mapPoints.AsParallel()
    .AsOrdered()
    .Select(mappoint => ComputeMapPoint(mappoint))
    .ToList();
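To make the idea concrete, here is a minimal runnable sketch of the same PLINQ pattern. The point source (a plain range of indices) and the body of `ComputeMapPoint` are placeholders assumed for illustration; they are not taken from the question. Consuming the ordered query with `foreach` on a single thread means the writes happen sequentially and in source order, so no hand-written buffer is needed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Program
{
    // Hypothetical per-point computation; stands in for the real work.
    private static float ComputeMapPoint(int pointIndex)
    {
        return pointIndex * 0.5f;
    }

    public static void Main()
    {
        // Lazy source of point indices, standing in for the map iterator.
        IEnumerable<int> mapPoints = Enumerable.Range(0, 10);

        // AsOrdered() asks PLINQ to yield results in source order,
        // even though they are computed in parallel.
        var computed = mapPoints
            .AsParallel()
            .AsOrdered()
            .Select(i => ComputeMapPoint(i));

        // The foreach runs on the calling thread, so writes are sequential.
        foreach (float value in computed)
        {
            Console.WriteLine(value); // replace with the real output writer
        }
    }
}
```

Compared with calling `ToList()`, enumerating the query directly lets results be written as they become available instead of after the whole map has been computed.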
Answer 1: (score: 1)
Here is a solution that I think should work, although I have not tested it. Add a new field:
private readonly Queue<AutoResetEvent> waitHandles = new Queue<AutoResetEvent>();
The two if statements (in StartPoint and EndPoint) need to be changed as follows.
Start:
if (bufferCount == BufferSize)
{
    AutoResetEvent ev = new AutoResetEvent(false);
    waitHandles.Enqueue(ev);
    ev.WaitOne();
}
End:
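if (endBufferCount == BufferSize)
{
    FlushBuffer();
    // Compute the count once: waitHandles.Count shrinks as handles are dequeued.
    int toRelease = Math.Min(waitHandles.Count, BufferSize);
    for (int i = 0; i < toRelease; ++i)
    {
        waitHandles.Dequeue().Set();
    }
}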
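One thing to watch with the snippet above: in StartPoint, `ev.WaitOne()` would run inside `lock (syncObject)`, and unlike `Monitor.Wait` it does not release the lock while blocking, so EndPoint could not enter to flush. As an alternative, here is a rough sketch of a writer that emits each point as soon as every earlier point has been written, which also covers the "point 8 is missing" case from the question. `OrderedWriter`, `Complete` and the `write` callback are names invented for this sketch. It assumes point indices are contiguous from 0, are handed to the threads in increasing order, and that every index is completed exactly once (pass null for a point without a value).

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical sketch: accepts results in any order, emits them in index order.
public sealed class OrderedWriter
{
    private readonly object sync = new object();
    private readonly Dictionary<long, float?> pending = new Dictionary<long, float?>();
    private readonly Action<long, float?> write;
    private readonly int capacity;
    private long next; // index of the next point that must be written

    public OrderedWriter(int capacity, Action<long, float?> write)
    {
        this.capacity = capacity;
        this.write = write;
    }

    public void Complete(long pointIndex, float? value)
    {
        lock (sync)
        {
            // Accept only points inside the window [next, next + capacity).
            // The point equal to 'next' is always inside it, so the thread
            // holding it never blocks. Monitor.Wait releases the lock.
            while (pointIndex >= next + capacity)
            {
                Monitor.Wait(sync);
            }

            pending[pointIndex] = value;

            // Write every consecutive point that is now ready.
            bool advanced = false;
            while (pending.TryGetValue(next, out float? ready))
            {
                pending.Remove(next);
                write(next, ready);
                next++;
                advanced = true;
            }
            if (advanced) Monitor.PulseAll(sync);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var output = new List<long>();
        // The callback runs under the writer's lock, so List.Add is safe here.
        var writer = new OrderedWriter(4, (index, value) => output.Add(index));
        long counter = -1;
        const int pointCount = 100;

        var workers = new Thread[4];
        for (int t = 0; t < workers.Length; t++)
        {
            workers[t] = new Thread(() =>
            {
                long i;
                // Each thread pulls the next index from a shared counter.
                while ((i = Interlocked.Increment(ref counter)) < pointCount)
                {
                    writer.Complete(i, i * 0.5f);
                }
            });
            workers[t].Start();
        }
        foreach (Thread w in workers) w.Join();
        Console.WriteLine(output.Count);
    }
}
```

The wait condition is checked in a `while` loop rather than an `if`, so a woken thread re-tests it before proceeding. In the real program the `write` callback would advance the output iterator and set its Current value.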
}