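GZipStream on large data

Date: 2012-05-16 15:29:53

Tags: c# gzipstream

I am trying to compress large amounts of data, sometimes in the region of 100 GB. When I run the routine I have written, the output file appears to come out exactly the same size as the input. Has anyone else had this issue with GZipStream?

My code is as follows: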

        byte[] buffer = BitConverter.GetBytes(StreamSize);
        FileStream LocalUnCompressedFS = File.OpenWrite(ldiFileName);
        LocalUnCompressedFS.Write(buffer, 0, buffer.Length);
        GZipStream LocalFS = new GZipStream(LocalUnCompressedFS, CompressionMode.Compress);
        buffer = new byte[WriteBlock];
        UInt64 WrittenBytes = 0;
        while (WrittenBytes + WriteBlock < StreamSize)
        {
            fromStream.Read(buffer, 0, (int)WriteBlock);
            LocalFS.Write(buffer, 0, (int)WriteBlock);
            WrittenBytes += WriteBlock;
            OnLDIFileProgress(WrittenBytes, StreamSize);
            if (Cancel)
                break;
        }
        if (!Cancel)
        {
            double bytesleft = StreamSize - WrittenBytes;
            fromStream.Read(buffer, 0, (int)bytesleft);
            LocalFS.Write(buffer, 0, (int)bytesleft);
            WrittenBytes += (uint)bytesleft;
            OnLDIFileProgress(WrittenBytes, StreamSize);
        }
        LocalFS.Close();
        fromStream.Close();

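StreamSize is an 8-byte UInt64 value that holds the size of the file. I write these 8 bytes raw at the start of the file so that I know the original file size. WriteBlock has the value 32 KB (32768 bytes). fromStream is the stream the data is taken from, in this case a FileStream. Will the 8 bytes in front of the compressed data cause a problem?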

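2 Answers:

Answer 0: (score: 5)

I ran a test using the following code for compression, and it ran without issue on 7 GB and 12 GB files (both known beforehand to compress "well"). Does this version work for you?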

const string toCompress = @"input.file";
var buffer = new byte[1024*1024*64];

using(var compressing = new GZipStream(File.OpenWrite(@"output.gz"), CompressionMode.Compress))
using(var file = File.OpenRead(toCompress))
{
    // Loop until Read reports end of file, and write only the bytes actually read.
    int bytesRead;
    while((bytesRead = file.Read(buffer, 0, buffer.Length)) > 0)
    {
        compressing.Write(buffer, 0, bytesRead);
    }
}

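Have you checked out the documentation?

The GZipStream class cannot decompress data that results in over 8 GB of uncompressed data.

You may need to find a different library that supports your needs, or try splitting your data into chunks of <=8 GB that can safely be "sewn" back together.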
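
One thing that may be worth ruling out, although nothing in the thread confirms it is the cause of the unchanged file size: File.OpenWrite, used in the code above, opens with FileMode.OpenOrCreate and does not truncate a file that already exists. If a larger file is already present at the output path, the compressed bytes overwrite its beginning and the old tail stays, so the length on disk does not shrink. File.Create truncates first. The sketch below illustrates the difference; the class name, the helper WriteAndMeasure, and the temporary file are assumptions made for the illustration.

```csharp
using System;
using System.IO;

static class OpenWriteVersusCreate
{
    // Writes `payload` to `path` through the stream returned by `open`
    // and returns the resulting file length in bytes.
    public static long WriteAndMeasure(string path, Func<string, FileStream> open, byte[] payload)
    {
        using (FileStream fs = open(path))
        {
            fs.Write(payload, 0, payload.Length);
        }
        return new FileInfo(path).Length;
    }

    public static void Main()
    {
        string path = Path.GetTempFileName(); // scratch file, illustration only
        try
        {
            File.WriteAllBytes(path, new byte[100]); // pre-existing 100-byte file
            byte[] small = new byte[10];

            // OpenWrite keeps the old tail: prints 100
            Console.WriteLine(WriteAndMeasure(path, File.OpenWrite, small));

            // Create truncates before writing: prints 10
            Console.WriteLine(WriteAndMeasure(path, File.Create, small));
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

If this turns out to be relevant, opening the output with File.Create (or deleting the old file first) would avoid the stale tail.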

Answer 1: (score: 0)

Austin Salonen's code did not work for me (it is buggy, and I got a 4 GB error).

This is the proper way to do it:

using System;
using System.Collections.Generic;
using System.Text;

namespace CompressFile
{
    class Program
    {


        static void Main(string[] args)
        {
            string FileToCompress = @"D:\Program Files (x86)\msvc\wkhtmltopdf64\bin\wkhtmltox64.dll";
            FileToCompress = @"D:\Program Files (x86)\msvc\wkhtmltopdf32\bin\wkhtmltox32.dll";
            string CompressedFile = System.IO.Path.Combine(
                 System.IO.Path.GetDirectoryName(FileToCompress)
                ,System.IO.Path.GetFileName(FileToCompress) + ".gz"
            );


            CompressFile(FileToCompress, CompressedFile);
            // CompressFile_AllInOne(FileToCompress, CompressedFile);

            Console.WriteLine(Environment.NewLine);
            Console.WriteLine(" --- Press any key to continue --- ");
            Console.ReadKey();
        } // End Sub Main


        public static void CompressFile(string FileToCompress, string CompressedFile)
        {
            //byte[] buffer = new byte[1024 * 1024 * 64];
            byte[] buffer = new byte[1024 * 1024]; // 1MB

            using (System.IO.FileStream sourceFile = System.IO.File.OpenRead(FileToCompress))
            {

                using (System.IO.FileStream destinationFile = System.IO.File.Create(CompressedFile))
                {

                    using (System.IO.Compression.GZipStream output = new System.IO.Compression.GZipStream(destinationFile,
                        System.IO.Compression.CompressionMode.Compress))
                    {
                        // Loop on the count returned by Read; an int running total
                        // would overflow on files larger than 2 GB.
                        int ReadLength;
                        while ((ReadLength = sourceFile.Read(buffer, 0, buffer.Length)) > 0)
                        {
                            output.Write(buffer, 0, ReadLength);
                        } // Whend

                        destinationFile.Flush();
                    } // End Using System.IO.Compression.GZipStream output

                    destinationFile.Close();
                } // End Using System.IO.FileStream destinationFile 

                // Close the files.
                sourceFile.Close();
            } // End Using System.IO.FileStream sourceFile

        } // End Sub CompressFile


        public static void CompressFile_AllInOne(string FileToCompress, string CompressedFile)
        {
            using (System.IO.FileStream sourceFile = System.IO.File.OpenRead(FileToCompress))
            {
                using (System.IO.FileStream destinationFile = System.IO.File.Create(CompressedFile))
                {

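                    // Loads the whole file into memory, so this variant only suits
                    // small files; a single Read is also not guaranteed to fill the buffer.
                    byte[] buffer = new byte[sourceFile.Length];
                    sourceFile.Read(buffer, 0, buffer.Length);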

                    using (System.IO.Compression.GZipStream output = new System.IO.Compression.GZipStream(destinationFile,
                        System.IO.Compression.CompressionMode.Compress))
                    {
                        output.Write(buffer, 0, buffer.Length);
                        output.Flush();
                        destinationFile.Flush();
                    } // End Using System.IO.Compression.GZipStream output

                    // Close the files.        
                    destinationFile.Close();
                } // End Using System.IO.FileStream destinationFile 

                sourceFile.Close();
            } // End Using System.IO.FileStream sourceFile

        } // End Sub CompressFile_AllInOne


    } // End Class Program


} // End Namespace CompressFile
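
On the question's last point, the 8-byte size prefix: writing raw bytes to the underlying stream before wrapping it in a GZipStream should not be a problem in itself, provided the reader consumes the same 8 bytes before it starts decompressing. The following is a minimal sketch of that round trip, not the asker's routine. The class and method names are assumptions made for the illustration, and the header uses the machine's byte order, as BitConverter does.

```csharp
using System;
using System.IO;
using System.IO.Compression;

static class LengthPrefixedGzip
{
    // Writes an 8-byte length header, then the gzip-compressed payload.
    public static void Compress(Stream source, long sourceLength, Stream destination)
    {
        byte[] header = BitConverter.GetBytes((ulong)sourceLength);
        destination.Write(header, 0, header.Length);

        // leaveOpen: true, so the caller keeps ownership of `destination`.
        using (var gzip = new GZipStream(destination, CompressionMode.Compress, true))
        {
            var buffer = new byte[32768];
            int read;
            // Read may return fewer bytes than requested, so use its return value.
            while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
            {
                gzip.Write(buffer, 0, read);
            }
        }
    }

    // Reads the header back, decompresses the rest, and returns the stored length.
    public static ulong Decompress(Stream source, Stream destination)
    {
        byte[] header = new byte[8];
        int got = 0;
        while (got < header.Length)
        {
            int n = source.Read(header, got, header.Length - got);
            if (n == 0) throw new EndOfStreamException("Truncated length header.");
            got += n;
        }

        using (var gzip = new GZipStream(source, CompressionMode.Decompress, true))
        {
            var buffer = new byte[32768];
            int read;
            while ((read = gzip.Read(buffer, 0, buffer.Length)) > 0)
            {
                destination.Write(buffer, 0, read);
            }
        }
        return BitConverter.ToUInt64(header, 0);
    }
}
```

Both methods accept any Stream, so the sketch can be tried on a MemoryStream before pointing it at large files. The 8 GB decompression limit quoted from the documentation in the first answer is a separate concern that this sketch does not address.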