Question

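I'm trying to convert a large message from bytes to a string (the byte array is 11076 bytes long).

The problem is that the resulting string seems to be missing characters (its length is 10996).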

The data is received over a Winsock connection. Here is the code that handles it:

    public static void UpdateClient(UserConnection client)
    {
        string data = null;
        Decoder utf8Decoder = Encoding.UTF8.GetDecoder();

        Console.WriteLine("Iniciando");
        byte[] buffer = ReadFully(client.TCPClient.GetStream(), 0);
        int charCount = utf8Decoder.GetCharCount(buffer, 0, buffer.Length);
        Char[] chars = new Char[charCount];
        int charsDecodedCount = utf8Decoder.GetChars(buffer, 0, buffer.Length, chars, 0);

        foreach (Char c in chars)
        {
            data = data + String.Format("{0}", c);
        }

        int buffersize = buffer.Length;
        Console.WriteLine("Chars is: " + chars.Length);
        Console.WriteLine("Data is: " + data);
        Console.WriteLine("Byte is: " + buffer.Length);
        Console.WriteLine("Size is: " + data.Length);
        Server.Network.ReceiveData.SelectPacket(client.Index, data);
    }

    public static byte[] ReadFully(Stream stream, int initialLength)
    {
        if (initialLength < 1)
        {
            initialLength = 32768;
        }

        byte[] buffer = new byte[initialLength];
        int read = 0;

        int chunk;

        chunk = stream.Read(buffer, read, buffer.Length - read);

        checkreach:
            read += chunk;

            if (read == buffer.Length)
            {
                int nextByte = stream.ReadByte();

                if (nextByte == -1)
                {
                    return buffer;
                }

                byte[] newBuffer = new byte[buffer.Length * 2];
                Array.Copy(buffer, newBuffer, buffer.Length);
                newBuffer[read] = (byte)nextByte;
                buffer = newBuffer;
                read++;
                goto checkreach;
            }

        byte[] ret = new byte[read];
        Array.Copy(buffer, ret, read);
        return ret;
    }

Does anyone have a tip or a solution?

Answer 1

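It is perfectly normal for UTF-8 encoded text to have more bytes than characters. In UTF-8, some characters (for example á and ã) are encoded as two or more bytes, so 11076 bytes decoding to 10996 characters does not by itself mean that anything was lost.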
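
To illustrate the point, here is a minimal sketch that compares the byte count and the character count of a short string. The class name `Utf8LengthDemo` and the method `Measure` are made up for this example and are not part of the original code; the sample text uses escapes for ç (U+00E7) and ã (U+00E3) so the result does not depend on how the source file is saved.

```csharp
using System;
using System.Text;

public static class Utf8LengthDemo
{
    // Hypothetical helper: encodes the text as UTF-8, decodes it again,
    // and returns the byte count alongside the decoded character count.
    public static (int Bytes, int Chars) Measure(string text)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(text);
        string roundTrip = Encoding.UTF8.GetString(bytes);
        return (bytes.Length, roundTrip.Length);
    }

    public static void Main()
    {
        // "ação": 'a' and 'o' take 1 byte each, 'ç' and 'ã' take 2 bytes each.
        var (bytes, chars) = Measure("a\u00E7\u00E3o");
        Console.WriteLine($"Bytes: {bytes}, Chars: {chars}"); // Bytes: 6, Chars: 4
    }
}
```

The same effect, spread over a long message containing accented characters, would account for a difference like the 80 bytes seen in the question.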

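Since the ReadFully method returns garbage if you try to use it to read more than fits in the initial buffer, or if it can't read the entire stream with a single Read call, you shouldn't use it. The way the char array is converted to a string is also extremely slow. Just use a StreamReader to read the stream and decode it to a string: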

public static void UpdateClient(UserConnection client) {
  string data;
  using (StreamReader reader = new StreamReader(client.TCPClient.GetStream(), Encoding.UTF8)) {
    data = reader.ReadToEnd();
  }
  Console.WriteLine("Data is: " + data);
  Console.WriteLine("Size is: " + data.Length);
  Server.Network.ReceiveData.SelectPacket(client.Index, data);
}
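
If the raw bytes are needed rather than a string, one possible alternative to ReadFully is a loop that keeps calling Read until it returns 0, collecting whatever each call delivers. This is only a sketch: the names `StreamHelpers` and `ReadAllBytes` and the 8192-byte chunk size are assumptions made for the example. As with ReadToEnd, on a network stream Read only returns 0 once the remote side has closed the connection, so this blocks until then.

```csharp
using System;
using System.IO;
using System.Text;

public static class StreamHelpers
{
    // Hypothetical helper: reads until end of stream (Read returns 0).
    // A single Read call may return fewer bytes than requested, so the
    // loop appends only the bytes actually read on each pass.
    public static byte[] ReadAllBytes(Stream stream)
    {
        byte[] chunk = new byte[8192]; // chunk size chosen arbitrarily
        using (MemoryStream collected = new MemoryStream())
        {
            int read;
            while ((read = stream.Read(chunk, 0, chunk.Length)) > 0)
            {
                collected.Write(chunk, 0, read);
            }
            return collected.ToArray();
        }
    }
}
```

The result can then be decoded in one step, for example `string data = Encoding.UTF8.GetString(StreamHelpers.ReadAllBytes(stream));`, which avoids building the string one character at a time.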

UTF8 Byte to String & Winsock GetStream
