Question

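I am trying to convert a string to bytes and back again. I have seen the earlier question on this site about converting a string to a byte array, but my problem is different.

Here is my code: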

byte[] btest = new byte[2];
btest[0] = 0xFF;
btest[1] = 0xAA;
UTF8Encoding enc = new UTF8Encoding();
string str = enc.GetString(btest); // here I get a string with the value str = "��"

// I had a byte array of size 2 with the contents above.
// Here I try to convert the string back to a byte array.
byte[] bst = enc.GetBytes(str); // at this step I get a byte array of size 6
// with the contents {239,191,189,239,191,189}

// In this step I try to restore the values of btest by indexing into the string.
btest[0] = Convert.ToByte(str[0]); // this line throws an exception
// Exception: Value was either too large or too small for an unsigned byte.
btest[1] = Convert.ToByte(str[1]);

Shouldn't GetBytes return a byte array of size 2? What am I doing wrong? I want bst[0] to contain the same value I assigned to btest[0].

Thanks

Answer 1

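Your original byte input is not valid UTF-8, because it does not represent any Unicode code point. As a result, each invalid byte is converted to � (U+FFFD, the replacement character). In the end this is a character like any other, so if you convert it back to bytes you do not get your original invalid byte sequence. You get the correct byte sequence for that Unicode code point (twice), which is why the array has six bytes.

That character cannot be represented as a single byte, so Convert.ToByte throws an OverflowException.

If you change the original input to a valid byte sequence, say: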
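To make the behavior described above concrete, here is a minimal sketch that reproduces both effects. The class and method names (ReplacementCharDemo, RoundTrip, OverflowsByte) are made up for illustration; only UTF8Encoding, Convert.ToByte and BitConverter come from the framework.

```csharp
using System;
using System.Text;

public static class ReplacementCharDemo
{
    // Decodes the bytes as UTF-8 and encodes the resulting string again.
    public static byte[] RoundTrip(byte[] input)
    {
        var enc = new UTF8Encoding();
        string str = enc.GetString(input); // each invalid byte becomes U+FFFD
        return enc.GetBytes(str);          // U+FFFD is encoded as EF BF BD
    }

    // Returns true if Convert.ToByte rejects the character with an OverflowException.
    public static bool OverflowsByte(char c)
    {
        try
        {
            Convert.ToByte(c);
            return false;
        }
        catch (OverflowException)
        {
            return true;
        }
    }

    public static void Main()
    {
        byte[] bst = RoundTrip(new byte[] { 0xFF, 0xAA });
        Console.WriteLine(BitConverter.ToString(bst)); // EF-BF-BD-EF-BF-BD
        Console.WriteLine(OverflowsByte('\uFFFD'));    // True
    }
}
```

The six bytes are simply U+FFFD encoded twice (239, 191, 189 in decimal), so the original 0xFF and 0xAA cannot be recovered from the string.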

btest[0] = 0xDF;
btest[1] = 0xBF;

you will see that the enc.GetBytes(str) call does indeed produce a two-byte array again.
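A short sketch of that round trip, assuming the same default UTF8Encoding as in the question. The class name ValidSequenceDemo and its RoundTrip helper are hypothetical names used only for this example.

```csharp
using System;
using System.Text;

public static class ValidSequenceDemo
{
    // Decodes the bytes as UTF-8 and encodes the resulting string again.
    public static byte[] RoundTrip(byte[] input)
    {
        var enc = new UTF8Encoding();
        string str = enc.GetString(input);
        return enc.GetBytes(str);
    }

    public static void Main()
    {
        byte[] btest = { 0xDF, 0xBF };
        string str = new UTF8Encoding().GetString(btest);

        Console.WriteLine(str.Length);                              // 1
        Console.WriteLine(((int)str[0]).ToString("X4"));            // 07FF
        Console.WriteLine(BitConverter.ToString(RoundTrip(btest))); // DF-BF
    }
}
```

Note that the two bytes decode to a single character, U+07FF, so Convert.ToByte(str[0]) would still overflow and str[1] does not exist. To get the bytes back, call GetBytes on the whole string rather than converting individual characters.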

Answer 2

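The byte sequence 0xFF 0xAA is not valid in the UTF-8 encoding, so it is converted to �

References:

See the valid code point ranges on the corresponding Wikipedia page: http://en.wikipedia.org/wiki/UTF-8#Description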
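If you would rather detect invalid input than have it silently replaced, one option is a strict decoder. The sketch below uses the UTF8Encoding constructor overload that takes throwOnInvalidBytes; the StrictUtf8Check class and IsValidUtf8 method are hypothetical names for this example.

```csharp
using System;
using System.Text;

public static class StrictUtf8Check
{
    // Returns true if the bytes decode as UTF-8 without any invalid sequence.
    public static bool IsValidUtf8(byte[] data)
    {
        // Second argument (throwOnInvalidBytes: true) makes decoding throw
        // instead of substituting U+FFFD for invalid bytes.
        var strict = new UTF8Encoding(false, true);
        try
        {
            strict.GetString(data);
            return true;
        }
        catch (DecoderFallbackException)
        {
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(IsValidUtf8(new byte[] { 0xFF, 0xAA })); // False
        Console.WriteLine(IsValidUtf8(new byte[] { 0xDF, 0xBF })); // True
    }
}
```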

