Question

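I am parsing a large file and want to monitor progress by displaying the number of bytes read so far. The actual code is much longer, but this is the part that does the counting.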

StreamReader sr = new StreamReader(FilePath);
while ((line = sr.ReadLine()) != null)
{
    // do parsing jobs

    byteCnt += Convert.ToUInt64(line.Length * sizeof(char));
}

Console.WriteLine(String.Format("{0:n0}", byteCnt) + "  Bytes");

The file is 16.9 GB (18,186,477,492 bytes).

But my program reports 34,816,805,164 bytes.

How can this happen, and how can I make this number more accurate?

Thanks.

Answer 1

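sizeof(char) is 2 in C# because a char is a UTF-16 code unit. Unless your file is itself UTF-16 encoded, line.Length * sizeof(char) will not match the number of bytes on disk. You can instead use, for example,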

// requires: using System.Text;
Encoding.ASCII.GetByteCount(line);
// or, for a UTF-8 file:
Encoding.UTF8.GetByteCount(line);

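to get the size. You need to pick the encoding that matches the file's actual encoding. Also note that ReadLine strips the line terminator, so those bytes are not included in the count.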
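To make the difference concrete, here is a minimal sketch that counts the same lines both ways and compares the totals with the size on disk. The temporary file, its two sample lines, the UTF-8 encoding, and the "\n" line endings are all assumptions chosen for illustration; they are not taken from the question's data.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical sample file: UTF-8 without a BOM, "\n" line endings.
string path = Path.GetTempFileName();
File.WriteAllText(path, "hello\nw\u00F6rld\n", new UTF8Encoding(false));

ulong naive = 0;    // the question's approach: UTF-16 code units * 2
ulong encoded = 0;  // bytes each line occupies in the file's encoding
ulong lines = 0;    // one "\n" byte per line was stripped by ReadLine

using (var sr = new StreamReader(path, Encoding.UTF8))
{
    string line;
    while ((line = sr.ReadLine()) != null)
    {
        naive += (ulong)(line.Length * sizeof(char));
        encoded += (ulong)Encoding.UTF8.GetByteCount(line);
        lines++;
    }
}

long onDisk = new FileInfo(path).Length;
File.Delete(path);

Console.WriteLine($"{naive} bytes via Length * sizeof(char)");    // 20
Console.WriteLine($"{encoded} bytes via GetByteCount");           // 11
Console.WriteLine($"{encoded + lines} bytes incl. terminators");  // 13
Console.WriteLine($"{onDisk} bytes on disk");                     // 13
```

In this sketch the GetByteCount total only matches the file size once one byte per line is added back for the stripped "\n". A file with "\r\n" endings would need two bytes per line, and a byte order mark, if present, would add a few more.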
