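I have a very large binary file whose sections need to be read concurrently from different threads (it is never written to), so I set up a test that uses memory-mapped files.
All of my tests indicate that regions of a memory-mapped file cannot be accessed and read concurrently: the time it takes n workers to read the data grows linearly with n. Am I doing something wrong?
Please see my code snippets below; perhaps I have missed something.
Here is the snippet that creates the memory-mapped files and keeps a reference to each handle while other threads/processes access the memory-mapped file regions: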
public void CreateNewMemoryMappedFiles(List<TimeSeriesRequest> timeSeriesRequests)
{
    foreach (var timeSeriesRequest in timeSeriesRequests)
    {
        var header = GetTimeSeriesHeader(timeSeriesRequest.PluginName, timeSeriesRequest.Symbol.SymbolId, timeSeriesRequest.QuoteType, timeSeriesRequest.Compression);
        if (_memoryMappedFileObjects.ContainsKey(header.BinaryFileName))
            continue;
        var pathFileName = _databaseDirectory + "\\" + header.BinaryFileName;
        //create memory mapped file
        var fileStream = File.OpenRead(pathFileName);
        var memoryMappedFile = MemoryMappedFile.CreateFromFile(fileStream, header.BinaryFileName, fileStream.Length, MemoryMappedFileAccess.Read, null, 0, false);
        //create new memory mapped file object and add to collection
        _memoryMappedFileObjects.Add(header.BinaryFileName, new MemoryMappedFileObject(header.BinaryFileName, memoryMappedFile));
    }
}
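And below is the excerpt used by the workers (note that the two excerpts obviously do not live in the same class instance, because different threads/processes are involved; each worker gets its own class instance containing the following):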
//read binary data
BinaryReader reader;
if (activeBinaryReaders.ContainsKey(header.BinaryFileName))
{
    reader = activeBinaryReaders[header.BinaryFileName];
}
else
{
    var mappedFile = MemoryMappedFile.OpenExisting(header.BinaryFileName, MemoryMappedFileRights.Read, HandleInheritability.None);
    var viewStream = mappedFile.CreateViewStream(0, 0, MemoryMappedFileAccess.Read);
    reader = new BinaryReader(viewStream);
    activeBinaryReaders.Add(header.BinaryFileName, reader);
}
//read bytes
byte[] bytes;
var indexPositionStart = BinarySearch(dateTimeTuple.Item1, reader, packetSize, true);
var indexPositionEnd = BinarySearch(dateTimeTuple.Item2, reader, packetSize, false);
//set starting point in filestream
reader.BaseStream.Seek(indexPositionStart, SeekOrigin.Begin);
//read and return byte array
bytes = reader.ReadBytes((int)(indexPositionEnd - indexPositionStart));
Finally, this snippet simulates several workers trying to stream data concurrently:
class Program
{
    private const int NumberWorkers = 3;
    private static Stopwatch _watch;

    static void Main(string[] args)
    {
        _watch = new Stopwatch();
        var processor = new ActionBlock<string>(workerId =>
        {
            var tsdb = new TimeSeriesDatabase(@"X:\Algorithmic Trading Static Data\BinaryDatabase");
            var dtFrom = new DateTime(2017, 1, 1, 0, 0, 0);
            var dtTo = new DateTime(2017, 12, 31, 23, 59, 59);
            var request = new TimeSeriesRequest("Me", "DukasCopy", new Symbol("USDJPY", AssetClass.Currency, Currency.JPY), QuoteType.BidAsk, TimeSpan.Zero, dtFrom, dtTo);
            var watch = new Stopwatch();
            var quoteCounter = 0;
            tsdb.StreamDataMemoryMapped(new List<TimeSeriesRequest>() { request }, 2000000, dataPacket =>
            {
                if (dataPacket.Flag == HistoricalDataStreamFlag.StartUp)
                {
                    watch.Start();
                }
                else if (dataPacket.Flag == HistoricalDataStreamFlag.Packet)
                {
                    foreach (var quote in dataPacket.Quotes)
                    {
                    }
                    quoteCounter += dataPacket.Quotes.Count;
                }
                else if (dataPacket.Flag == HistoricalDataStreamFlag.Shutdown)
                {
                    watch.Stop();
                    Console.WriteLine($"Worker {workerId} processed {quoteCounter} quotes and took {watch.ElapsedMilliseconds} milliseconds to process");
                }
            }, new CancellationToken());
        }, new ExecutionDataflowBlockOptions() { MaxDegreeOfParallelism = NumberWorkers });

        //create memory mapped file(s)
        var outsideTsdb = new TimeSeriesDatabase(@"X:\Algorithmic Trading Static Data\BinaryDatabase");
        var outsideDtFrom = new DateTime(2017, 1, 1, 0, 0, 0);
        var outsideDtTo = new DateTime(2017, 12, 31, 23, 59, 59);
        var outsideRequest = new TimeSeriesRequest("Me", "DukasCopy", new Symbol("USDJPY", AssetClass.Currency, Currency.JPY), QuoteType.BidAsk, TimeSpan.Zero, outsideDtFrom, outsideDtTo);
        outsideTsdb.CreateNewMemoryMappedFiles(new List<TimeSeriesRequest>() { outsideRequest });

        foreach (var workerIndex in Enumerable.Range(1, NumberWorkers))
        {
            processor.Post($"Worker {workerIndex}");
        }
        _watch.Start();

        //mark completion
        processor.Complete();
        processor.Completion.Wait();

        //print out results
        _watch.Stop();
        Console.WriteLine($"{NumberWorkers} workers took {_watch.ElapsedMilliseconds} milliseconds in total.");
        Console.WriteLine("Press button to stop");
        Console.ReadLine();
    }
}
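One way to narrow this down is to take TimeSeriesDatabase out of the picture and time concurrent reads against a bare mapping. The following is only a minimal sketch under stated assumptions: the temporary test file, the region size, the worker count, and the class name ConcurrentViewSketch are all hypothetical, and the mapping is unnamed rather than opened by name as in the code above. Each task creates its own view over a separate region of a single read-only mapping, so no reader or stream position is shared between workers.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.IO.MemoryMappedFiles;
using System.Linq;
using System.Threading.Tasks;

class ConcurrentViewSketch
{
    static void Main()
    {
        // Hypothetical test setup: these sizes are illustrative only.
        const int workers = 4;
        const int regionSize = 4 * 1024 * 1024; // 4 MiB per worker

        var path = Path.GetTempFileName();
        try
        {
            // Fill the temporary file with one repeating block per worker.
            var block = new byte[regionSize];
            for (var i = 0; i < block.Length; i++)
                block[i] = (byte)(i % 251);
            using (var fs = File.OpenWrite(path))
            {
                for (var w = 0; w < workers; w++)
                    fs.Write(block, 0, block.Length);
            }

            // One read-only mapping for the whole process (null map name = unnamed,
            // capacity 0 = size of the file on disk).
            using (var mmf = MemoryMappedFile.CreateFromFile(path, FileMode.Open, null, 0, MemoryMappedFileAccess.Read))
            {
                var total = Stopwatch.StartNew();
                var tasks = Enumerable.Range(0, workers).Select(w => Task.Run(() =>
                {
                    var watch = Stopwatch.StartNew();
                    // Each worker gets its own view over its own region.
                    using (var view = mmf.CreateViewAccessor((long)w * regionSize, regionSize, MemoryMappedFileAccess.Read))
                    {
                        var buffer = new byte[regionSize];
                        var read = view.ReadArray(0, buffer, 0, buffer.Length);
                        watch.Stop();
                        Console.WriteLine($"Worker {w} read {read} bytes in {watch.ElapsedMilliseconds} ms");
                        return read;
                    }
                })).ToArray();

                Task.WaitAll(tasks);
                total.Stop();
                Console.WriteLine($"{workers} workers read {tasks.Sum(t => (long)t.Result)} bytes in {total.ElapsedMilliseconds} ms in total");
            }
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

If the per-worker times in such an isolated test stay roughly flat as the worker count rises, the linear growth seen above more likely comes from elsewhere in the pipeline (for example the disk, or the processing inside StreamDataMemoryMapped) than from the mapping itself; the sketch does not settle that on its own.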
Answer 0 (score: -1)
In this kind of process, a good practice is to create a copy of the file as a temporary file.
Example:
// YOUR CODE
var pathFileName = _databaseDirectory + "\\" + header.BinaryFileName;
var tempPathFilename = "temporarypath" + "\\" + header.BinaryFileName;
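Programmatically copy the original file to the tempPathFilename location.
Whenever you want to access the file, make a temporary copy of it and work on that temporary file (not the original) to avoid blocking.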
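The copy step described in this answer could look roughly like the sketch below. The helper name CopyToTemp, the class name TempCopySketch, and the use of a uniquely named folder under the system temp directory are assumptions made for illustration; the answer itself only names tempPathFilename and a "temporarypath" placeholder.

```csharp
using System;
using System.IO;

static class TempCopySketch
{
    // Hypothetical helper: copies sourcePath into a fresh, uniquely named
    // directory under the system temp directory and returns the copy's path.
    public static string CopyToTemp(string sourcePath)
    {
        var tempDir = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N"));
        Directory.CreateDirectory(tempDir);
        var tempPathFilename = Path.Combine(tempDir, Path.GetFileName(sourcePath));
        File.Copy(sourcePath, tempPathFilename, false); // false = do not overwrite
        return tempPathFilename;
    }

    static void Main()
    {
        // Stand-in for the real binary file, so the sketch runs on its own.
        var original = Path.GetTempFileName();
        File.WriteAllBytes(original, new byte[] { 1, 2, 3, 4 });

        var copy = CopyToTemp(original);
        try
        {
            // Work on the copy, never on the original.
            var bytes = File.ReadAllBytes(copy);
            Console.WriteLine($"Read {bytes.Length} bytes from {copy}");
        }
        finally
        {
            // Remove the copy, its directory, and the stand-in original.
            File.Delete(copy);
            Directory.Delete(Path.GetDirectoryName(copy));
            File.Delete(original);
        }
    }
}
```

Note that every copy reads the whole source file and uses the same amount of disk space again, so for a very large file this cost has to be weighed against whatever blocking the copy is meant to avoid.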