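I'm trying to compress strings as much as possible. I have working code that compresses a string to a Base64 string and decompresses it back from that Base64 string.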
    using System;
    using System.IO;
    using System.IO.Compression;
    using System.Text;

    public static string CompressString(string text)
    {
        byte[] buffer = Encoding.UTF8.GetBytes(text);
        using (var memoryStream = new MemoryStream())
        {
            // leaveOpen: true keeps memoryStream usable after the GZipStream is disposed (disposing flushes it).
            using (var gZipStream = new GZipStream(memoryStream, CompressionMode.Compress, true))
            {
                gZipStream.Write(buffer, 0, buffer.Length);
            }
            byte[] compressedData = memoryStream.ToArray();
            // Layout: 4-byte uncompressed length, followed by the GZip data.
            var gZipBuffer = new byte[compressedData.Length + 4];
            Buffer.BlockCopy(compressedData, 0, gZipBuffer, 4, compressedData.Length);
            Buffer.BlockCopy(BitConverter.GetBytes(buffer.Length), 0, gZipBuffer, 0, 4);
            return Convert.ToBase64String(gZipBuffer); // RETURNS AS BASE64
            //return Encoding.UTF8.GetString(gZipBuffer); // RETURN AS UTF8 STRING
        }
    }
    public static string DecompressString(string compressedText)
    {
        byte[] gZipBuffer = Convert.FromBase64String(compressedText); // BASE64 STRING TO BYTE ARRAY
        //byte[] gZipBuffer = Encoding.UTF8.GetBytes(compressedText); // UTF8 STRING TO BYTE ARRAY
        // The first 4 bytes hold the uncompressed length; the GZip data follows.
        int dataLength = BitConverter.ToInt32(gZipBuffer, 0);
        var buffer = new byte[dataLength];
        using (var memoryStream = new MemoryStream(gZipBuffer, 4, gZipBuffer.Length - 4))
        using (var gZipStream = new GZipStream(memoryStream, CompressionMode.Decompress))
        {
            // Read may return fewer bytes than requested, so loop until the buffer is full.
            int offset = 0;
            while (offset < buffer.Length)
            {
                int read = gZipStream.Read(buffer, offset, buffer.Length - offset);
                if (read == 0) break;
                offset += read;
            }
        }
        return Encoding.UTF8.GetString(buffer);
    }
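This works fine. However, if I switch CompressString to return Encoding.UTF8.GetString(gZipBuffer) instead of Convert.ToBase64String(gZipBuffer), and change DecompressString to use Encoding.UTF8.GetBytes(compressedText) instead of Convert.FromBase64String(compressedText), the read into the buffer throws an exception during decompression (although compression itself works fine):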
Additional information: The magic number in GZip header is not correct. Make sure you are passing in a GZip stream.
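A small check that may help narrow this down: the sketch below round-trips the first GZip header bytes (0x1F 0x8B, then the method byte 0x08) through Encoding.UTF8.GetString and Encoding.UTF8.GetBytes. The class and method names (Utf8RoundTripDemo, RoundTrip) are hypothetical and only for illustration. Since 0x8B on its own is not valid UTF-8, the decoder replaces it with U+FFFD, so the bytes that come back are not the bytes that went in, which would be consistent with the magic number error.

```csharp
using System;
using System.Text;

public static class Utf8RoundTripDemo
{
    // Decodes arbitrary bytes as UTF-8 text and encodes the text back to bytes.
    // For binary data this is lossy: invalid sequences become U+FFFD (EF BF BD).
    public static byte[] RoundTrip(byte[] data)
    {
        string asText = Encoding.UTF8.GetString(data);
        return Encoding.UTF8.GetBytes(asText);
    }

    public static void Main()
    {
        // The two GZip magic bytes followed by the compression method byte.
        byte[] header = { 0x1F, 0x8B, 0x08 };
        byte[] roundTripped = RoundTrip(header);

        Console.WriteLine(BitConverter.ToString(header));       // 1F-8B-08
        Console.WriteLine(BitConverter.ToString(roundTripped)); // 1F-EF-BF-BD-08
    }
}
```

If this matches what happens in DecompressString, the data is already damaged by the time CompressString returns, and the exception on the decompress side is only where it becomes visible.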
The problem with using Base64 is that the final compressed string is about 40% longer than the string produced with Encoding.UTF8.GetString and Encoding.UTF8.GetBytes.
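For reference on where the size overhead comes from: Base64 maps every 3 input bytes to 4 output characters, so the encoded length is 4 * ceil(n / 3), roughly a third more characters than bytes. The sketch below checks that arithmetic against Convert.ToBase64String; Base64SizeDemo and Base64Length are hypothetical names used only here. The measured figure can differ from one third when the comparison is against a UTF-8 decoded string, because that string's character count need not equal the byte count.

```csharp
using System;

public static class Base64SizeDemo
{
    // Number of characters Convert.ToBase64String produces for byteCount
    // input bytes (padding included, no line breaks inserted).
    public static int Base64Length(int byteCount)
    {
        return 4 * ((byteCount + 2) / 3);
    }

    public static void Main()
    {
        var data = new byte[1000];
        string encoded = Convert.ToBase64String(data);

        Console.WriteLine(encoded.Length);            // 1336
        Console.WriteLine(Base64Length(data.Length)); // 1336
    }
}
```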
Is there any way to compress a string without having to Base64-encode the result?