Question

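I get an Out of Memory exception when retrieving large blob data from SQL Server. I am calling a stored procedure that returns 6 columns of simple data and 1 varbinary(max) data column.

I execute the stored procedure with this code: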

m_DataReader = cmd.ExecuteReader(CommandBehavior.SequentialAccess);

I make sure to read the columns from the data reader in sequential column order.

See the MSDN article on retrieving large data.

For the varbinary(max) column, I read the data like this:

DocBytes = m_DataReader.GetValue(i) as byte[];

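What I noticed is that, at the point where the Out of Memory exception occurs, I seem to have 2 copies of the byte array in memory. One is in the DocBytes array, and the other is in an internal buffer of the SqlDataReader.

Why is there a copy? I thought I would be handed a reference. Or is this due to the way SqlDataReader provides the data internally, i.e. does it always hand out a copy?

Is there a more memory-efficient way to read the data from the database?

I have looked at the new .NET 4.5 GetStream method, but unfortunately I cannot pass the stream on. I need the bytes in memory, so I cannot follow the other examples that stream to a file or a web response. But I do want to make sure that only one copy exists in memory at a time!

I have come to the conclusion that this is probably just the way it has to be, and that the duplicate is simply a buffer that has not been garbage collected yet. I really do not want to muck about with forcing garbage collection, and I am hoping someone has ideas about alternative approaches.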

Answer 1

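"I have looked at the new .NET 4.5 GetStream method, but unfortunately I cannot pass the stream on. I need the bytes in memory"

So all you have to do is read this stream into a byte array.

Alternatively, you could try reading from the reader in small chunks with the GetBytes method, as shown here: https://stackoverflow.com/a/625485/29407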
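
A minimal sketch of reading such a stream into a byte array when the length is not known up front. The helper name ReadAllBytes and the 80 KB chunk size are illustrative assumptions, not part of the original answer. In the real case the stream would come from the reader's GetStream method; Main uses a MemoryStream as a stand-in so the sketch runs without a database. Note that MemoryStream.ToArray makes one final copy, so two copies of the payload briefly coexist.

```csharp
using System;
using System.IO;

public static class StreamToBytesSketch
{
    // Hypothetical helper: drains any readable stream into a byte array.
    public static byte[] ReadAllBytes(Stream source)
    {
        if (source == null) throw new ArgumentNullException("source");

        byte[] chunk = new byte[80 * 1024]; // small, reusable chunk buffer
        using (MemoryStream ms = new MemoryStream())
        {
            int read;
            // Stream.Read returns 0 once the end of the stream is reached.
            while ((read = source.Read(chunk, 0, chunk.Length)) > 0)
            {
                ms.Write(chunk, 0, read);
            }
            return ms.ToArray(); // copies the accumulated bytes once
        }
    }

    public static void Main()
    {
        // Stand-in payload instead of a varbinary(max) column.
        byte[] payload = new byte[200000];
        new Random(1).NextBytes(payload);

        using (Stream standIn = new MemoryStream(payload))
        {
            byte[] result = ReadAllBytes(standIn);
            Console.WriteLine(result.Length);
        }
    }
}
```

If the total length is available beforehand, allocating the target array once and filling it directly avoids the final copy.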

Answer 2

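You have a choice when retrieving binary data from SQL. Assuming you are using varbinary (image is deprecated) as your data type, you can either return all of the data or return part of it with a simple substring function. If the binary is huge (say 1 GB), returning all of the data at once will be very memory intensive.

If that is the case, you can take a more iterative approach to returning the data. Say it is a 1 GB binary: you can have the program loop over the data in 100 MB chunks, writing each chunk to disk, discarding the buffer, and then going back for the next 100 MB chunk.

To get the first chunk you would use: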

Declare @ChunkCounter as integer = 0
Declare @Data as varbinary(max)
Declare @ChunkSize as integer = 10000000   /* 10,000,000 bytes, roughly 10 MB */
Declare @bytes as integer
Select @bytes = datalength(YourField) from YourTable where ID = YourID
If @bytes > @ChunkSize
      Begin
           /* use substring to get the first chunk; SUBSTRING positions are 1-based */
           /* a SELECT that assigns variables cannot also return columns, so the counter is assigned too */
           Select @Data = substring(YourField, 1, @ChunkSize), @ChunkCounter = @ChunkCounter + 1
           FROM YourTable
           where ID = YourID
      End
Else
      Begin ....
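
The client-side loop over the chunks comes down to computing the start and length arguments for each SUBSTRING call. A minimal sketch of that arithmetic follows; the helper name GetChunkRanges is a hypothetical one introduced here, and the sketch only prints the calls rather than running them against a database.

```csharp
using System;
using System.Collections.Generic;

public static class ChunkRangeSketch
{
    // Hypothetical helper: yields (start, length) pairs to pass to
    // SUBSTRING(YourField, start, length). Start positions are 1-based,
    // and the last chunk is shortened to the bytes that remain.
    public static IEnumerable<KeyValuePair<long, int>> GetChunkRanges(long totalBytes, int chunkSize)
    {
        if (totalBytes < 0) throw new ArgumentOutOfRangeException("totalBytes");
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");

        for (long start = 1; start <= totalBytes; start += chunkSize)
        {
            long remaining = totalBytes - start + 1;
            int length = (int)Math.Min(chunkSize, remaining);
            yield return new KeyValuePair<long, int>(start, length);
        }
    }

    public static void Main()
    {
        // 25 bytes in chunks of 10 gives the ranges (1,10), (11,10), (21,5).
        foreach (KeyValuePair<long, int> range in GetChunkRanges(25, 10))
        {
            Console.WriteLine("SUBSTRING(YourField, {0}, {1})", range.Key, range.Value);
        }
    }
}
```

Each pair would be passed as parameters to a query like the one above, so only one chunk-sized buffer needs to be alive on the client at a time.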

Answer 3

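Do you know the length of the data? If so, you can use the streaming approach to copy the data into a perfectly sized byte[]. That would get rid of the double buffering that seems to happen inside ADO.NET in the non-streaming case.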
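
A minimal sketch of that idea, under two assumptions that are not in the original answer: the length is known in advance (for example from a DATALENGTH column that the stored procedure returns before the blob column), and the stream comes from the reader's GetStream method. The helper name ReadExactly is hypothetical, and Main uses a MemoryStream as a stand-in so the sketch runs without a database.

```csharp
using System;
using System.IO;

public static class ExactSizeReadSketch
{
    // Hypothetical helper: fills a byte[] of exactly `length` bytes from the stream.
    public static byte[] ReadExactly(Stream source, int length)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (length < 0) throw new ArgumentOutOfRangeException("length");

        byte[] result = new byte[length]; // the only large allocation made here
        int filled = 0;
        while (filled < length)
        {
            // Read may return fewer bytes than requested, so loop until full.
            int read = source.Read(result, filled, length - filled);
            if (read == 0)
                throw new EndOfStreamException("Stream ended before the expected length was read.");
            filled += read;
        }
        return result;
    }

    public static void Main()
    {
        // Stand-in payload instead of a varbinary(max) column.
        byte[] payload = new byte[150000];
        new Random(2).NextBytes(payload);

        using (Stream standIn = new MemoryStream(payload))
        {
            byte[] docBytes = ReadExactly(standIn, payload.Length);
            Console.WriteLine(docBytes.Length);
        }
    }
}
```

Because the bytes land directly in the final array, no intermediate MemoryStream or second full-size buffer is created by this code.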

Answer 4

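DocBytes = m_DataReader.GetValue(i) as byte[];

This creates a buffer the size of DATA_LENGTH(column_name), which is then copied in its entirety into your MemoryStream.
That is bad when DATA_LENGTH(column_name) is a large value.
You need to copy it into the memory stream through a buffer instead.

Also, if your file is that large, write it to a temporary file rather than holding it in a MemoryStream in its entirety.

This is how I do it: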

    // http://stackoverflow.com/questions/2885335/clr-sql-assembly-get-the-bytestream
    // http://stackoverflow.com/questions/891617/how-to-read-a-image-by-idatareader
    // http://stackoverflow.com/questions/4103406/extracting-a-net-assembly-from-sql-server-2005
    public static void RetrieveFileStream(System.Data.IDbCommand cmd, string columnName, string path)
    {
        using (System.Data.IDataReader reader = cmd.ExecuteReader(System.Data.CommandBehavior.SequentialAccess | System.Data.CommandBehavior.CloseConnection))
        {
            bool hasRows = reader.Read();
            if (hasRows)
            {
                const int BUFFER_SIZE = 1024 * 1024 * 10; // 10 MB
                byte[] buffer = new byte[BUFFER_SIZE];

                int col = string.IsNullOrEmpty(columnName) ? 0 : reader.GetOrdinal(columnName);
                int bytesRead = 0;
                int offset = 0;

                // Write the byte stream out to disk
                using (System.IO.FileStream bytestream = new System.IO.FileStream(path, System.IO.FileMode.Create, System.IO.FileAccess.Write, System.IO.FileShare.None))
                {
                    while ((bytesRead = (int)reader.GetBytes(col, offset, buffer, 0, BUFFER_SIZE)) > 0)
                    {
                        bytestream.Write(buffer, 0, bytesRead);
                        offset += bytesRead;
                    } // Whend

                    bytestream.Close();
                } // End Using bytestream 

            } // End if (hasRows)

            reader.Close();
        } // End Using reader

    } // End Function RetrieveFileStream

Writing to a MemoryStream with this code is simple.
You may need to make the buffer size smaller or larger.

    public static System.IO.MemoryStream RetrieveMemoryStream(System.Data.IDbCommand cmd, string columnName, string path)
    {
        System.IO.MemoryStream ms = new System.IO.MemoryStream();

        using (System.Data.IDataReader reader = cmd.ExecuteReader(System.Data.CommandBehavior.SequentialAccess | System.Data.CommandBehavior.CloseConnection))
        {
            bool hasRows = reader.Read();
            if (hasRows)
            {
                const int BUFFER_SIZE = 1024 * 1024 * 10; // 10 MB
                byte[] buffer = new byte[BUFFER_SIZE];

                int col = string.IsNullOrEmpty(columnName) ? 0 : reader.GetOrdinal(columnName);
                int bytesRead = 0;
                int offset = 0;

                // Copy the column into the MemoryStream one buffer at a time
                while ((bytesRead = (int)reader.GetBytes(col, offset, buffer, 0, BUFFER_SIZE)) > 0)
                {
                    ms.Write(buffer, 0, bytesRead);
                    offset += bytesRead;
                } // Whend

            } // End if (hasRows)

            reader.Close();
        } // End Using reader

        return ms;
    } // End Function RetrieveMemoryStream

If you need to put it into Response.OutputStream, consider writing to it directly rather than going through MemoryStream.ToArray() + WriteBytes.

Answer 5

The problem is that DbDataReader.GetStream() creates a MemoryStream and fills this stream with the field's data. To avoid that, I created an extension method:

using System;
using System.Data;
using System.IO;

public static class DataReaderExtensions
{
    /// <summary>
    /// writes the content of the field into a stream
    /// </summary>
    /// <param name="reader"></param>
    /// <param name="ordinal"></param>
    /// <param name="stream"></param>
    /// <returns>number of written bytes</returns>
    public static long WriteToStream(this IDataReader reader, int ordinal, Stream stream)
    {
        if (stream == null)
            throw new ArgumentNullException("stream");

        if (reader.IsDBNull(ordinal))
            return 0;

        long num = 0L;
        byte[] array = new byte[8192];
        long bytes;
        do
        {
            bytes = reader.GetBytes(ordinal, num, array, 0, array.Length);
            stream.Write(array, 0, (int)bytes);
            num += bytes;
        }
        while (bytes > 0L);
        return num;
    }

    /// <summary>
    /// writes the content of the field into a stream
    /// </summary>
    /// <param name="reader"></param>
    /// <param name="field"></param>
    /// <param name="stream"></param>
    /// <returns>number of written bytes</returns>
    public static long WriteToStream(this IDataReader reader, string field, Stream stream)
    {
        int ordinal = reader.GetOrdinal(field);
        return WriteToStream(reader, ordinal, stream);
    }
}

Most memory-efficient way to retrieve blob data from SQL Server