Question

My code:

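        // Needs: using System; using System.Text;
        string input1 = Console.ReadLine();

        Console.WriteLine("byte output");

        // Convert the input string to its byte representation.
        byte[] bInput1 = Encoding.Unicode.GetBytes(input1);

        // Print each byte's index and decimal value.
        for (int x = 0; x < bInput1.Length; x++)
            Console.WriteLine("{0} = {1}", x, bInput1[x]);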

Output:

104 0 101 0 108 0 108 0 111 0

for the input "hello".

Is there a reference to a character map that I can use to make sense of this?

Answer 1

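You should read "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)" at http://www.joelonsoftware.com/articles/Unicode.html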

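You can find a list of all Unicode characters at http://www.unicode.org, but don't expect to be able to read those files without understanding the issues around text encoding.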

Answer 2

At http://www.unicode.org/charts/ you can find all the Unicode code charts. http://www.unicode.org/charts/PDF/U0000.pdf shows that the code point for 'h' is U+0068. (Another good tool for viewing this data is BabelMap.)

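The exact details of the UTF-16 encoding can be found at http://unicode.org/faq/utf_bom.html#6 and http://www.ietf.org/rfc/rfc2781.txt. In short, U+0068 is encoded (in UTF-16LE, which is what Encoding.Unicode produces) as 0x68 0x00. In decimal, those are the first two bytes you see: 104 0.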

The other characters are encoded similarly.
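To make the mapping concrete, here is a minimal sketch that lists each character of a string with its code point and its two UTF-16LE bytes. The class `Utf16Demo` and the method `Describe` are hypothetical names made up for this illustration. The sketch assumes every character is in the Basic Multilingual Plane, so each `char` is a single two-byte code unit.

```csharp
using System;
using System.Text;

public static class Utf16Demo
{
    // Hypothetical helper: for each char in s, emit one line with the
    // character, its code point in hex, and its two bytes in decimal
    // (low byte first, because Encoding.Unicode is little-endian).
    public static string Describe(string s)
    {
        var sb = new StringBuilder();
        foreach (char c in s)
        {
            byte low = (byte)(c & 0xFF);   // least significant byte comes first
            byte high = (byte)(c >> 8);    // most significant byte comes second
            sb.AppendFormat("'{0}' U+{1:X4} -> {2} {3}\n", c, (int)c, low, high);
        }
        return sb.ToString();
    }
}
```

Calling `Console.Write(Utf16Demo.Describe("hello"));` should print one line per character, starting with `'h' U+0068 -> 104 0`. Those byte pairs match what `Encoding.Unicode.GetBytes("hello")` returns.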

Finally, apart from the Unicode Standard itself, a good reference (when trying to make sense of the various Unicode specifications) is the Unicode Glossary.

Can someone explain Encoding.Unicode.GetBytes("hello") for me?
