Question

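I noticed some strange behaviour while reading a file line by line. If the file ends with \n (an empty line), that line may be skipped... but not always, and I can't see what causes it to be skipped or not.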

I wrote this small function, which splits a string into lines, to reproduce the issue easily:

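#include <sstream>
#include <string>
#include <vector>

std::vector<std::string> SplitLines( const std::string& inputStr )
{
    std::vector<std::string> lines;

    std::stringstream str;
    str << inputStr;

    // std::getline returns the stream, which converts to false once a read fails
    std::string sContent;
    while ( std::getline( str, sContent ) )
    {
        lines.push_back( sContent );
    }

    return lines;
}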

When I test it (http://cpp.sh/72dgw), I get the following output:

(1) "a\nb"       was splitted to 2 line(s):"a" "b" 
(2) "a"          was splitted to 1 line(s):"a" 
(3) ""           was splitted to 0 line(s):
(4) "\n"         was splitted to 1 line(s):"" 
(5) "\n\n"       was splitted to 2 line(s):"" "" 
(6) "\nb\n"      was splitted to 2 line(s):"" "b" 
(7) "a\nb\n"     was splitted to 2 line(s):"a" "b" 
(8) "a\nb\n\n"   was splitted to 3 line(s):"a" "b" ""

For cases (6), (7) and (8), the final \n is skipped, which is fine. But why is it not skipped for (4) and (5)?

What is the rationale behind this behaviour?

Answer 1

There is an interesting post that briefly mentions this "strange" behaviour: getline() sets failbit and skips last line

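As Rob's answer there explains, \n is a terminator (which is why it is called End Of Line), not a separator, meaning that a line is defined as "terminated by a '\n'", not "separated by a '\n'".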
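To make the terminator/separator distinction concrete, here is a minimal sketch that splits the same input both ways. SplitTerminated and SplitSeparated are hypothetical helper names introduced only for this illustration; the first simply wraps std::getline, the second is a hand-written separator-style split that is not part of the standard library.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Terminator semantics: what std::getline does. Each '\n' ends a line;
// trailing characters without a '\n' still count as one last line.
std::vector<std::string> SplitTerminated(const std::string& input)
{
    std::vector<std::string> lines;
    std::istringstream stream(input);
    std::string line;
    while (std::getline(stream, line))
        lines.push_back(line);
    return lines;
}

// Separator semantics (hypothetical, for comparison only): n occurrences
// of '\n' always produce n + 1 pieces, including empty ones.
std::vector<std::string> SplitSeparated(const std::string& input)
{
    std::vector<std::string> pieces;
    std::string::size_type start = 0;
    for (;;)
    {
        const std::string::size_type pos = input.find('\n', start);
        if (pos == std::string::npos)
        {
            // Whatever follows the last separator, possibly nothing.
            pieces.push_back(input.substr(start));
            return pieces;
        }
        pieces.push_back(input.substr(start, pos - start));
        start = pos + 1;
    }
}
```

With the input "a\nb\n", SplitTerminated yields "a" and "b", while SplitSeparated yields "a", "b" and an extra empty piece. Under terminator semantics there is no "last empty line" to skip in the first place.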

It was not clear to me at first how this answered the question, but it actually does. Reformulated as below, it becomes crystal clear:

If your content contains x occurrences of '\n', you will end up with x lines, or x+1 if there are some extra non-'\n' characters at the end of the file.

(1) "a\nb"       splitted to 2 line(s):"a" "b"    (1 EOL + extra characters = 2 lines)
(2) "a"          splitted to 1 line(s):"a"        (0 EOL + extra characters = 1 line)
(3) ""           splitted to 0 line(s):           (0 EOL + no extra characters = 0 line)
(4) "\n"         splitted to 1 line(s):""         (1 EOL + no extra characters = 1 line) 
(5) "\n\n"       splitted to 2 line(s):"" ""      (2 EOL + no extra characters = 2 lines)
(6) "\nb\n"      splitted to 2 line(s):"" "b"     (2 EOL + no extra characters = 2 lines)
(7) "a\nb\n"     splitted to 2 line(s):"a" "b"    (2 EOL + no extra characters = 2 lines)
(8) "a\nb\n\n"   splitted to 3 line(s):"a" "b" "" (3 EOL + no extra characters = 3 lines)
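The counting rule in the table can be checked mechanically. The sketch below is only an illustration: ExpectedLineCount and GetlineCount are hypothetical names, not taken from the original code. The first applies the "x EOL, plus one if extra characters follow" rule directly; the second counts how many times std::getline succeeds.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>

// The rule from the table: one line per '\n', plus one more line if the
// input ends with characters that are not followed by a '\n'.
std::size_t ExpectedLineCount(const std::string& input)
{
    const std::size_t terminators =
        static_cast<std::size_t>(std::count(input.begin(), input.end(), '\n'));
    const bool hasUnterminatedTail = !input.empty() && input.back() != '\n';
    return terminators + (hasUnterminatedTail ? 1 : 0);
}

// Counts how many lines std::getline actually delivers for the same input.
std::size_t GetlineCount(const std::string& input)
{
    std::istringstream stream(input);
    std::string line;
    std::size_t count = 0;
    while (std::getline(stream, line))
        ++count;
    return count;
}
```

For each of the eight inputs above, both functions should return the same number, for example 1 for "\n" and 2 for "a\nb\n".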
