Question

Possible duplicate: Converting byte array to string and back again in C#

I am using the Huffman Coding implementation from here to compress and decompress some text.

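The code there builds a Huffman tree that is used for both encoding and decoding. When I use the code directly, everything works fine.

In my case, I need to take the compressed content, store it, and decompress it later when needed.

The encoder's output and the decoder's input are both a BitArray.

When I convert this BitArray to a String and back to a BitArray, and then decode it with the code below, I get a strange result.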

    // Requires: using System; using System.Collections; using System.Text;
    string input = Console.ReadLine();
    Tree huffmanTree = new Tree();
    huffmanTree.Build(input);
    BitArray encoded = huffmanTree.Encode(input);

    // Print the bits
    Console.Write("Encoded Bits: ");
    foreach (bool bit in encoded)
    {
        Console.Write((bit ? 1 : 0) + "");
    }
    Console.WriteLine();

    // Convert the bit array to bytes
    Byte[] e = new Byte[(encoded.Length / 8 + (encoded.Length % 8 == 0 ? 0 : 1))];
    encoded.CopyTo(e, 0);

    // Convert the bytes to string
    string output = Encoding.UTF8.GetString(e);

    // Convert string back to bytes
    e = Encoding.UTF8.GetBytes(output);

    // Convert bytes back to bit array
    BitArray todecode = new BitArray(e);
    string decoded = huffmanTree.Decode(todecode);
    Console.WriteLine("Decoded: " + decoded);
    Console.ReadLine();

The output of the tutorial's original code is:

The output of my code is:

Where am I going wrong? Please help, and thanks in advance.

Answer 1

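You cannot stuff arbitrary bytes into a string. That concept is simply undefined. Converting between bytes and strings is done through an encoding.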

string output = Encoding.UTF8.GetString(e);

At this point e is just binary garbage, not a UTF-8 string, so calling a UTF-8 method on it makes no sense.
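To see the problem concretely, here is a minimal sketch. The class and method names are made up for illustration and are not part of the question's code. It checks whether a byte array comes back unchanged after going through Encoding.UTF8 in both directions.

```csharp
using System;
using System.Linq;
using System.Text;

public static class Utf8RoundTripDemo
{
    // Returns true only if bytes -> UTF-8 string -> bytes yields the same bytes.
    public static bool SurvivesUtf8RoundTrip(byte[] original)
    {
        // Byte sequences that are not valid UTF-8 are replaced with U+FFFD here,
        // so the original values are lost at this step.
        string text = Encoding.UTF8.GetString(original);
        byte[] back = Encoding.UTF8.GetBytes(text);
        return original.SequenceEqual(back);
    }
}
```

For example, SurvivesUtf8RoundTrip(new byte[] { 0xFF, 0xFE }) returns false because 0xFF can never appear in valid UTF-8, while the bytes produced by Encoding.UTF8.GetBytes("hello") survive unchanged. Compressed Huffman output will usually fall into the first category.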

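Solution: don't convert to a string and back. That is not a round trip. Why are you doing it in the first place? If you really need a string, use a round-trippable format such as base-64 or base-85.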
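As a rough sketch of the base-64 option, the standard Convert class can turn any byte array into a string and back without loss. The wrapper class and method names below are hypothetical and only there to keep the example self-contained.

```csharp
using System;

public static class Base64RoundTrip
{
    // Any byte values are allowed; the result contains only ASCII characters.
    public static string ToText(byte[] data)
    {
        return Convert.ToBase64String(data);
    }

    // Recovers exactly the bytes that were passed to ToText.
    public static byte[] FromText(string text)
    {
        return Convert.FromBase64String(text);
    }
}
```

With this, ToText(new byte[] { 0xFF, 0xFE }) gives "//4=", and FromText("//4=") gives back the same two bytes. The string is about a third longer than the raw data, which is the price of being safe to store as text.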

Answer 2

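I'm pretty sure Encoding does not round-trip. That is, you cannot decode an arbitrary sequence of bytes into a string, use the same Encoding to get bytes back, and always expect them to be identical to the originals.

If you want to round-trip from raw bytes to a string and back to the same raw bytes, you need to use something like base64 encoding.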

http://blogs.microsoft.co.il/blogs/mneiter/archive/2009/03/22/how-to-encoding-and-decoding-base64-strings-in-c.aspx
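Applied to the BitArray from the question, that could look roughly like the sketch below. Note one extra wrinkle: a BitArray built from bytes always has a length that is a multiple of 8, so the padding bits could be decoded as extra symbols. This sketch therefore stores the bit count in front of the base64 text. The "count:base64" format and the names BitArrayText, Serialize and Deserialize are assumptions of this example, not something from the tutorial code.

```csharp
using System;
using System.Collections;

public static class BitArrayText
{
    // Produces "<bit count>:<base64 of the packed bytes>" (format is an assumption of this sketch).
    public static string Serialize(BitArray bits)
    {
        byte[] bytes = new byte[(bits.Length + 7) / 8];
        bits.CopyTo(bytes, 0);
        return bits.Length + ":" + Convert.ToBase64String(bytes);
    }

    // Rebuilds a BitArray with exactly the original number of bits.
    public static BitArray Deserialize(string text)
    {
        int sep = text.IndexOf(':'); // ':' never occurs in base64 output
        int bitCount = int.Parse(text.Substring(0, sep));
        byte[] bytes = Convert.FromBase64String(text.Substring(sep + 1));

        BitArray bits = new BitArray(bytes);
        bits.Length = bitCount; // drop the padding bits added when packing into bytes
        return bits;
    }
}
```

In the question's code, the string returned by Serialize(encoded) would be stored, and Deserialize would later rebuild the BitArray to pass to huffmanTree.Decode.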

Bit array to string and back to bit array
