c# - Reading a file with irregular line breaks

Time: 2016-04-12 11:52:03

Tags: c# file

I am trying to read a text file in C# that has the following format:

this is a line\r\n
this is a line\r
\r\n
this is a line\r
\r\n
this is a line\r
\r\n
this is a line\r\n
this is a line\r
\r\n
etc...

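I am reading each line from the file with StreamReader.ReadLine(), but this does not preserve the line terminators. I need to detect which newline characters each line ends with, because I am counting the number of bytes in each line. For example: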

If a line ends with \r, it contains ((nr-of-bytes-in-line) + 1 byte) bytes (depending on the encoding, of course); if it ends with \r\n, it contains ((nr-of-bytes-in-line) + 2 bytes) bytes.
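To make the arithmetic concrete, here is a minimal sketch that counts the bytes of one sample line with each terminator. The choice of UTF-8 and UTF-16 is an assumption for illustration; the file's real encoding is not stated in the question.

```csharp
using System;
using System.Text;

class ByteCountExample
{
    static void Main()
    {
        string content = "this is a line"; // 14 characters, all ASCII

        // Assumed encodings, for illustration only.
        foreach (Encoding encoding in new[] { Encoding.UTF8, Encoding.Unicode })
        {
            int withCr = encoding.GetByteCount(content + "\r");
            int withCrLf = encoding.GetByteCount(content + "\r\n");
            Console.WriteLine("{0}: \\r -> {1} bytes, \\r\\n -> {2} bytes",
                encoding.WebName, withCr, withCrLf);
        }
    }
}
```

In UTF-8 each terminator character adds one byte (15 and 16 bytes here); in UTF-16 each adds two (30 and 32 bytes), which is why the "+ 1" and "+ 2" above only hold for single-byte terminators.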

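Edit

I have a solution, based on israel altar's answer. By the way, Jon Skeet suggested the same approach. I have implemented an overridden version of ReadLine so that the returned string includes the newline characters. Here is the code of the overridden function: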

    public override String ReadLine()
    {
        StringBuilder sb = new StringBuilder();
        while (true)
        {
            int ch = Read();
            if (ch == -1)
            {
                break;
            }
            sb.Append((char)ch);
            if (ch == '\r' || ch == '\n')
            {
                // A line ends at "\r\n", a lone '\r', or a lone '\n';
                // the terminator is kept in the returned string.
                if (ch == '\r' && Peek() == '\n')
                {
                    sb.Append((char)Read());
                }
                break;
            }
        }
        if (sb.Length > 0)
        {
            return sb.ToString();
        }
        return null;
    }
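For context, an override like this has to live in a class derived from StreamReader. The following is a rough, self-contained sketch of how that might look and how the per-line byte counts could then be taken. The class name `TerminatorPreservingReader`, the in-memory sample data, and the UTF-8 encoding are all assumptions for illustration, and this variant ends a line at any of \r\n, \r, or \n.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical subclass; Read() and Peek() are inherited from StreamReader.
class TerminatorPreservingReader : StreamReader
{
    public TerminatorPreservingReader(Stream stream, Encoding encoding)
        : base(stream, encoding) { }

    // Returns the next line including its terminator, or null at end of input.
    public override string ReadLine()
    {
        var sb = new StringBuilder();
        int ch;
        while ((ch = Read()) != -1)
        {
            sb.Append((char)ch);
            if (ch == '\r' || ch == '\n')
            {
                // Consume the '\n' of a "\r\n" pair so it stays on this line.
                if (ch == '\r' && Peek() == '\n') sb.Append((char)Read());
                break;
            }
        }
        return sb.Length > 0 ? sb.ToString() : null;
    }
}

class Program
{
    static void Main()
    {
        Encoding encoding = Encoding.UTF8; // assumed encoding
        byte[] data = encoding.GetBytes("this is a line\r\nthis is a line\r\r\n");

        using (var reader = new TerminatorPreservingReader(new MemoryStream(data), encoding))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                // The terminator is part of the string, so it is counted too.
                Console.WriteLine(encoding.GetByteCount(line)); // 16, then 15, then 2
            }
        }
    }
}
```

Because every character read from the stream ends up in exactly one returned line, the per-line counts add up to the length of the input (33 bytes in this sample).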

1 Answer:

Answer 0 (score: 1)

This is how ReadLine is implemented, according to the .NET source:

        // Reads a line. A line is defined as a sequence of characters followed by
        // a carriage return ('\r'), a line feed ('\n'), or a carriage return
        // immediately followed by a line feed. The resulting string does not
        // contain the terminating carriage return and/or line feed. The returned
        // value is null if the end of the input stream has been reached.
        //
        public virtual String ReadLine() 
        {
            StringBuilder sb = new StringBuilder();
            while (true) {
                int ch = Read();
                if (ch == -1) break;
                if (ch == '\r' || ch == '\n') 
                {
                    if (ch == '\r' && Peek() == '\n') Read();
                    return sb.ToString();
                }
                sb.Append((char)ch);
            }
            if (sb.Length > 0) return sb.ToString();
            return null;
        }

As you can see, you could add an if statement like this:

    if (ch == '\r')
    {
        //add the amount of bytes wanted
    }
    if (ch == '\n')
    {
        //add the amount of bytes wanted
    }

or perform whatever other operation you need.
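One way this suggestion might be fleshed out is to count the terminator bytes at the point where the terminator is detected, rather than keeping it in the returned string. The sketch below is an assumption-laden illustration: `LineByteCounter` and `CountLineBytes` are hypothetical names, and it assumes '\r' and '\n' encode to the same number of bytes, which holds for UTF-8 and UTF-16.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class LineByteCounter
{
    // Hypothetical helper: returns the byte count of each line, terminator included.
    public static List<int> CountLineBytes(TextReader reader, Encoding encoding)
    {
        var counts = new List<int>();
        var sb = new StringBuilder();
        // Assumption: '\r' and '\n' have the same encoded size in this encoding.
        int terminatorCharBytes = encoding.GetByteCount("\n");
        int ch;
        while ((ch = reader.Read()) != -1)
        {
            if (ch == '\r' || ch == '\n')
            {
                int terminatorChars = 1;
                if (ch == '\r' && reader.Peek() == '\n')
                {
                    reader.Read(); // consume the '\n' of a "\r\n" pair
                    terminatorChars = 2;
                }
                counts.Add(encoding.GetByteCount(sb.ToString())
                           + terminatorChars * terminatorCharBytes);
                sb.Clear();
            }
            else
            {
                sb.Append((char)ch);
            }
        }
        // A final line without a terminator still counts.
        if (sb.Length > 0) counts.Add(encoding.GetByteCount(sb.ToString()));
        return counts;
    }
}

class Demo
{
    static void Main()
    {
        var counts = LineByteCounter.CountLineBytes(
            new StringReader("a\rb\r\nc"), Encoding.UTF8);
        Console.WriteLine(string.Join(", ", counts)); // 2, 3, 1
    }
}
```

Taking a TextReader rather than a StreamReader keeps the helper usable with a StringReader in tests; for a real file, a StreamReader opened with the file's actual encoding would be passed instead.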