Question

The StreamReader documentation says that "StreamReader defaults to UTF-8 encoding unless specified otherwise". Take this code:

StreamReader sr = new StreamReader(new FileStream("D:\\1.txt", FileMode.Open, FileAccess.Read));
string str = sr.ReadToEnd();
Console.WriteLine(str);
sr.Close();

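Does this mean that when I read a file, it will be treated as UTF-8 encoded? Or does it mean something else? I have tested reading a UTF-16LE encoded file and it worked without any problem.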


Answer 1

Perhaps the simplest way to find the answer is to run some tests:

using System;
using System.IO;

internal static class Program
{
    private static void Main()
    {
        var bytes1 = new byte[] {0x00, 0x61, 0x25, 0x54};
        var bytes2 = new byte[] {0xFE, 0xFF, 0x00, 0x61, 0x25, 0x54};
        var bytes3 = new byte[] {0xFF, 0xFE, 0x61, 0x00, 0x54, 0x25};

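        Write(bytes1); // No BOM, decoded as UTF-8. Writes: '\0a%T' (the NUL usually shows as a blank)
        Write(bytes2); // UTF-16 BE BOM (FE FF). Writes: 'a╔'
        Write(bytes3); // UTF-16 LE BOM (FF FE). Writes: 'a╔'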

        Console.ReadKey();
    }

    private static void Write(byte[] bytes)
    {
        using (var ms = new MemoryStream(bytes))
        {
            using (var sr = new StreamReader(ms))
            {
                var str = sr.ReadToEnd();
                Console.WriteLine(str);
            }
        }
    }
}

So, if the first 2 bytes of the stream are the byte order mark (BOM) of ~~UTF-16~~ Unicode (either LE or BE), the stream is read as a ~~UTF-16~~ Unicode stream. Otherwise it is read as UTF-8.
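To check this conclusion without relying on how the console renders the characters, one option is to ask the reader which encoding it settled on. The following is a minimal sketch, not part of the original answer: the class name EncodingProbe and the method DetectedCodePage are made up for illustration. It relies on StreamReader.CurrentEncoding, which is only meaningful after the first read.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper (not from the original answer).
internal static class EncodingProbe
{
    // Returns the code page of the encoding StreamReader ends up using:
    // 65001 = UTF-8, 1200 = UTF-16 LE, 1201 = UTF-16 BE.
    internal static int DetectedCodePage(byte[] bytes)
    {
        using (var ms = new MemoryStream(bytes))
        using (var sr = new StreamReader(ms))
        {
            // BOM detection happens on the first read, so read before asking.
            sr.ReadToEnd();
            return sr.CurrentEncoding.CodePage;
        }
    }
}
```

Calling EncodingProbe.DetectedCodePage with the three byte arrays from the test program should report UTF-8 for the BOM-less array and the two UTF-16 variants for the arrays that start with FE FF and FF FE.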

[Edit]

Oddly, the documentation for StreamReader Constructor (Stream, Encoding) contains information that the documentation for StreamReader Constructor (Stream) does not:

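The StreamReader object attempts to detect the encoding by looking at the first three bytes of the stream. It automatically recognizes UTF-8, little-endian Unicode, and big-endian Unicode text if the file starts with the appropriate byte order marks. Otherwise, the user-provided encoding is used.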

Note first of all: the user-provided encoding is not necessarily the one that is used.

Now, if you look at the reference implementation, the constructor that takes only a Stream as a parameter is actually a call to:

StreamReader(stream: stream, encoding: Encoding.UTF8, detectEncodingFromByteOrderMarks: true, bufferSize: DefaultBufferSize, leaveOpen: false)

So the information above applies.

More precisely, it is this one:

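If the detectEncodingFromByteOrderMarks parameter is true, the constructor detects the encoding by looking at the first three bytes of the stream. It automatically recognizes UTF-8, little-endian Unicode, and big-endian Unicode text if the file starts with the appropriate byte order marks. Otherwise, the user-provided encoding is used.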
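The interaction between the supplied encoding and BOM detection can be tried out directly. The sketch below is an illustration only: the class BomOverrideDemo and its Read method are hypothetical names, and it uses the StreamReader(Stream, Encoding, bool) constructor, where the third argument is detectEncodingFromByteOrderMarks.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper (not from the original answer).
internal static class BomOverrideDemo
{
    // Decodes the bytes with the given fallback encoding,
    // optionally letting a leading BOM override that encoding.
    internal static string Read(byte[] bytes, Encoding fallback, bool detectBom)
    {
        using (var ms = new MemoryStream(bytes))
        using (var sr = new StreamReader(ms, fallback, detectBom))
        {
            return sr.ReadToEnd();
        }
    }
}
```

With a UTF-16 LE BOM in front (FF FE 61 00 54 25), passing Encoding.UTF8 and detectBom set to true should still decode the text as UTF-16, because the BOM wins over the supplied encoding. For UTF-16 data without a BOM (61 00 54 25), the reader has nothing to detect, so the text only decodes correctly when Encoding.Unicode is passed explicitly.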

Can StreamReader read all encodings by default?
