Question

I have a string represented as a byte array, and I know the 0x00 high-order bytes have been stripped out, so the string is compressed to:

0x43 0x6F 0x6D 0x6D 0x61 0x6E 0x64 //"Command"

How do I convert the bytes to a Unicode string?

I'm guessing I need to copy the bytes into a new array twice the size (uncompressedBytes), filling every other position:

byte[] compressedBytes = br.ReadBytes(stringLength);
byte[] uncompressedBytes = new byte[stringLength * 2];
for (int byteCounter = 0; byteCounter < stringLength; byteCounter++)
{
    uncompressedBytes[byteCounter * 2] = compressedBytes[byteCounter];
}
return Encoding.Unicode.GetString(uncompressedBytes);

Or is there an encoding that treats every byte as a Unicode character whose high byte is missing?

Answer 1

The first 256 code points were made identical to the content of ISO-8859-1, so as to make it trivial to convert existing Western text.

https://en.m.wikipedia.org/wiki/Unicode

Encoding.GetEncoding("ISO-8859-1").GetString(bytes)
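
A minimal sketch of why this works, assuming the input really is one byte per character with every code point below U+0100. The class and method names (Latin1Demo, Decode, DecodeByWidening) are made up for illustration. Decoding as ISO-8859-1 should give the same result as widening each byte to a char by hand:

```csharp
using System;
using System.Text;

// Hypothetical helper class, for illustration only.
static class Latin1Demo
{
    // Decode bytes whose 0x00 high bytes were stripped, using ISO-8859-1,
    // which maps each byte value directly to the same Unicode code point.
    public static string Decode(byte[] compressedBytes)
    {
        return Encoding.GetEncoding("ISO-8859-1").GetString(compressedBytes);
    }

    // Equivalent manual approach: widen each byte to a 16-bit char.
    public static string DecodeByWidening(byte[] compressedBytes)
    {
        char[] chars = new char[compressedBytes.Length];
        for (int i = 0; i < compressedBytes.Length; i++)
        {
            chars[i] = (char)compressedBytes[i];
        }
        return new string(chars);
    }
}
```

With the bytes from the question (0x43 0x6F 0x6D 0x6D 0x61 0x6E 0x64), both methods should return "Command". Neither helps if some characters originally had a non-zero high byte, since that information is gone.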

Answer 2

If you know that all the bytes are 0x7F or less, you can treat them as UTF-8 and use the System.Text.UTF8Encoding converter class.
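
A rough sketch of that idea, assuming you want to verify the 0x7F limit rather than trust it. The names AsciiDemo and DecodeAscii are hypothetical. It uses Encoding.UTF8, which is an instance of UTF8Encoding; the check matters because a byte above 0x7F is not a valid single-byte UTF-8 sequence and would otherwise decode to a replacement character.

```csharp
using System;
using System.Text;

// Hypothetical helper class, for illustration only.
static class AsciiDemo
{
    // Decode bytes as UTF-8 after confirming every byte is in the ASCII range,
    // where UTF-8, ASCII and ISO-8859-1 all agree.
    public static string DecodeAscii(byte[] bytes)
    {
        foreach (byte b in bytes)
        {
            if (b > 0x7F)
            {
                throw new ArgumentException(
                    "Byte above 0x7F found; input is not pure ASCII.", nameof(bytes));
            }
        }
        return Encoding.UTF8.GetString(bytes);
    }
}
```

If the data can contain bytes in the 0x80 to 0xFF range, the ISO-8859-1 approach is the safer choice.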
