C# OutOfMemoryException when storing a file as byte arrays in a DataSet

Time: 2014-04-07 16:11:00

Tags: c# .net stream out-of-memory

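I am getting an "out of memory" exception that I cannot reproduce locally, but it occurs every time the build server runs the unit tests. Running the same unit tests on my machine does not throw. The change was made because the original code had strange problems with large PDFs in the incoming stream. If you know why the original code has trouble with large PDFs, or why the new code causes the "out of memory" exception, please let me know.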

Original code:

// stream is a valid Stream and parentKey is a valid int
// Reset the stream position
stream.Position = 0;
int sequenceNumber = 0;
int StreamReadSize = short.MaxValue;
byte[] buffer = new byte[StreamReadSize]; 
MemoryStream outStream = null;
try
{
    long previousStreamPosition = 0;
    long DataBlockSize  = 52428800;
    int read;
    while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
    {
        if (outStream == null)
            outStream = new MemoryStream(new byte[System.Math.Min(stream.Length - previousStreamPosition, DataBlockSize)]);

        previousStreamPosition = stream.Position;
        outStream.Write(buffer, 0, read);
        if (outStream.Position <= (DataBlockSize - StreamReadSize) && stream.Position < stream.Length)
            continue;

        var dataRow = dataSet.Tables["table_name"].NewRow();
        dataRow["parent_key"] = parentKey;
        dataRow["key"] = FuncThatReturnsNextAvailableKey();
        dataRow["sequence_number"] = ++sequenceNumber;
        // Reset the position and Zip up the data
        outStream.Position = 0;

        dataRow["data_segment"] = FuncThatZipsAStreamToByteArray(outStream);

        dataSet.Tables["table_name"].Rows.Add(dataRow);

        outStream.Flush();
        outStream.Dispose();
        outStream = null;
    }
}
finally
{
    if (outStream != null)
        outStream.Dispose();
}

New code:

// stream is a valid Stream and parentKey is a valid int
// Reset the stream position and create the variables needed for saving the file data
stream.Position = 0;
int sequenceNumber = 0;
int bytesRead;
int DataBlockSize = 52428800;
byte[] buffer = new byte[DataBlockSize];
while ((bytesRead = stream.Read(buffer, 0, DataBlockSize)) > 0)
{
    sequenceNumber++;

    // Create and initialize the row
    var dataRow = dataSet.Tables["table_name"].NewRow();
    dataRow["parent_key"] = parentKey;
    dataRow["key"] = FuncThatReturnsNextAvailableKey(); ;
    dataRow["sequence_number"] = sequenceNumber;

    // If the stream reads in less data than the size of the buffer then create an appropriately sized version of the buffer
    // that will only hold the data that was read in
    if (bytesRead != DataBlockSize)
    {
        var shrunkBuffer = new byte[bytesRead];
        Array.Copy(buffer, shrunkBuffer, bytesRead);
        using (var memoryStream = new MemoryStream(shrunkBuffer))
            dataRow["data_segment"] = FuncThatZipsAStreamToByteArray(memoryStream);
    }
    else
    {
        using (var memoryStream = new MemoryStream(buffer))
            dataRow["data_segment"] = FuncThatZipsAStreamToByteArray(memoryStream);
    }

    // Add the finished row
    dataSet.Tables["table_name"].Rows.Add(dataRow);
}

2 answers:

Answer 0 (score: 2)

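It makes sense that two different environments can produce different results. Your build server may simply have less memory available than your personal development machine.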
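One way to compare the two environments is to log a few memory-related facts from inside the test process on each machine. The following is a minimal sketch; the `EnvironmentReport` class and its `Describe` method are hypothetical names introduced here, not part of the original code.

```csharp
using System;
using System.Diagnostics;

static class EnvironmentReport
{
    // Hypothetical helper: builds a one-line summary of values that
    // commonly differ between a developer machine and a build server.
    public static string Describe()
    {
        using (var process = Process.GetCurrentProcess())
        {
            return string.Format(
                "64-bit process: {0}, 64-bit OS: {1}, managed heap: {2} bytes, private bytes: {3}",
                Environment.Is64BitProcess,        // a 32-bit process has far less address space
                Environment.Is64BitOperatingSystem,
                GC.GetTotalMemory(false),          // approximate managed heap size, no forced collection
                process.PrivateMemorySize64);      // memory committed to this process
        }
    }

    static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```

If the build server runs the tests in a 32-bit process while the local run is 64-bit, that difference alone could plausibly explain why only one of them fails, although this is an assumption to verify rather than a diagnosis.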

It may be that you are keeping the byte arrays in memory through this line:

dataRow["data_segment"] = FuncThatZipsAStreamToByteArray(memoryStream);

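You are disposing the output stream, but I assume the data row stays in memory, so you still hold a reference to that byte array. With several PDFs, this could push the process up to the maximum amount of memory it can allocate for itself.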
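The point about rows retaining their byte arrays can be sketched as follows. This is an illustration under stated assumptions, not the asker's actual code: `Zip` is a hypothetical stand-in for `FuncThatZipsAStreamToByteArray` built on `GZipStream`, the `key` column is omitted because `FuncThatReturnsNextAvailableKey` is not shown, and the block size is passed in so that a smaller value can be tried. In .NET, arrays of roughly 85,000 bytes or more are allocated on the large object heap, so a 50 MB read buffer needs one large contiguous allocation; whether a smaller block size avoids the exception on the build server is an assumption to test.

```csharp
using System;
using System.Data;
using System.IO;
using System.IO.Compression;

static class ChunkingSketch
{
    // Hypothetical stand-in for FuncThatZipsAStreamToByteArray:
    // gzips only the first `count` bytes of `data`, so no trimmed copy is needed.
    public static byte[] Zip(byte[] data, int count)
    {
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress, true))
            {
                gzip.Write(data, 0, count);
            }
            return output.ToArray();
        }
    }

    // Stream.Read may return fewer bytes than requested, so keep reading
    // until the buffer is full or the stream ends.
    static int FillBuffer(Stream stream, byte[] buffer)
    {
        int total = 0, read;
        while (total < buffer.Length &&
               (read = stream.Read(buffer, total, buffer.Length - total)) > 0)
        {
            total += read;
        }
        return total;
    }

    // Adds one row per block and returns the number of rows added.
    // Every row keeps its compressed array alive for as long as the table lives.
    public static int AddSegments(Stream stream, DataTable table, int parentKey, int blockSize)
    {
        var buffer = new byte[blockSize]; // allocated once and reused for every block
        int sequenceNumber = 0;
        int filled;
        while ((filled = FillBuffer(stream, buffer)) > 0)
        {
            var row = table.NewRow();
            row["parent_key"] = parentKey;
            row["sequence_number"] = ++sequenceNumber;
            row["data_segment"] = Zip(buffer, filled);
            table.Rows.Add(row);
        }
        return sequenceNumber;
    }

    static void Main()
    {
        var table = new DataTable("table_name");
        table.Columns.Add("parent_key", typeof(int));
        table.Columns.Add("sequence_number", typeof(int));
        table.Columns.Add("data_segment", typeof(byte[]));

        // 10,000 bytes read in 4,096-byte blocks yields three rows.
        using (var input = new MemoryStream(new byte[10000]))
        {
            Console.WriteLine(AddSegments(input, table, 1, 4096));
        }
    }
}
```

Reusing a single buffer keeps the per-file working set small, but the compressed segments stored in the rows still accumulate until the `DataSet` is saved and released, which is the retention described above.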

Answer 1 (score: 0)

Use the MemoryTributary class.

Source: https://gist.github.com/bittercoder/3588074

using (System.IO.FileStream stream = new System.IO.FileStream(fileName, System.IO.FileMode.Open, System.IO.FileAccess.Read))
{
    using (MemoryTributary memT = new MemoryTributary())
    {
        memT.ReadFrom(stream, stream.Length);
        return memT.ToArray();
    }
}