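I noticed some strange behaviour while reading a file line by line. If the file ends with \n (an empty line), that last line may be skipped... but not always, and I can't see what causes it to be skipped or not.
I wrote this small function, which splits a string into lines, to reproduce the issue easily: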
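#include <sstream>
#include <string>
#include <vector>

// Splits inputStr into lines using std::getline (default delimiter '\n').
std::vector<std::string> SplitLines( const std::string& inputStr )
{
    std::vector<std::string> lines;
    std::stringstream str;
    str << inputStr;
    std::string sContent;
    // The loop ends on the first std::getline call that extracts nothing.
    while ( std::getline( str, sContent ) )
    {
        lines.push_back( sContent );
    }
    return lines;
}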
When I test it (http://cpp.sh/72dgw), I get this output:
(1) "a\nb" was splitted to 2 line(s):"a" "b"
(2) "a" was splitted to 1 line(s):"a"
(3) "" was splitted to 0 line(s):
(4) "\n" was splitted to 1 line(s):""
(5) "\n\n" was splitted to 2 line(s):"" ""
(6) "\nb\n" was splitted to 2 line(s):"" "b"
(7) "a\nb\n" was splitted to 2 line(s):"a" "b"
(8) "a\nb\n\n" was splitted to 3 line(s):"a" "b" ""
For cases (6), (7) and (8), the final \n is skipped, which is fine. But why isn't it skipped for (4) and (5)?
What is the rationale behind this behaviour?
Answer 0 (score: 2)
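There is an interesting post that briefly mentions this "strange" behaviour: getline() sets failbit and skips last line
As Rob's answer there points out, \n is a terminator (which is why it is called End Of Line), not a separator. In other words, a line is defined as "terminated by '\n'", not "separated by '\n'".
It was not obvious to me how this answers the question, but it actually does. Reformulated as below, it becomes clear: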
If your content has x occurrences of '\n', you will end up with x lines, or x+1 lines if there are some extra non-'\n' characters at the end of the file.
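As a rough sketch, the rule above can be written as a small function and compared against the question's SplitLines. ExpectedLineCount is a hypothetical helper name introduced here for illustration only; SplitLines is repeated (with the same logic) so that the snippet compiles on its own.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Same logic as SplitLines from the question, repeated so the sketch is self-contained.
std::vector<std::string> SplitLines(const std::string& inputStr)
{
    std::vector<std::string> lines;
    std::istringstream str(inputStr);
    std::string sContent;
    while (std::getline(str, sContent))
    {
        lines.push_back(sContent);
    }
    return lines;
}

// Hypothetical helper: predicts the number of lines from the rule
// "x occurrences of '\n' give x lines, plus one if characters follow the last '\n'".
std::size_t ExpectedLineCount(const std::string& inputStr)
{
    const std::size_t eolCount = static_cast<std::size_t>(
        std::count(inputStr.begin(), inputStr.end(), '\n'));
    // "Extra characters" means the input is non-empty and does not end with '\n'.
    const bool hasTrailingChars = !inputStr.empty() && inputStr.back() != '\n';
    return eolCount + (hasTrailingChars ? 1 : 0);
}
```

For every input in the question, SplitLines(s).size() should equal ExpectedLineCount(s), which is what the annotated cases illustrate one by one.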
(1) "a\nb" splitted to 2 line(s):"a" "b" (1 EOL + extra characters = 2 lines)
(2) "a" splitted to 1 line(s):"a" (0 EOL + extra characters = 1 line)
(3) "" splitted to 0 line(s): (0 EOL + no extra characters = 0 line)
(4) "\n" splitted to 1 line(s):"" (1 EOL + no extra characters = 1 line)
(5) "\n\n" splitted to 2 line(s):"" "" (2 EOL + no extra characters = 2 lines)
(6) "\nb\n" splitted to 2 line(s):"" "b" (2 EOL + no extra characters = 2 lines)
(7) "a\nb\n" splitted to 2 line(s):"a" "b" (2 EOL + no extra characters = 2 lines)
(8) "a\nb\n\n" splitted to 3 line(s):"a" "b" "" (3 EOL + no extra characters = 3 lines)
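To see why trailing characters count as a line while a trailing '\n' adds none, it can help to look at the stream state after each std::getline call. The sketch below is only an illustration: ReadResult and TraceGetline are hypothetical names, not part of the original code. It records every call, including the final failing one.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record of one std::getline call.
struct ReadResult
{
    bool ok;          // true if the call succeeded (stream still usable as "true")
    bool eof;         // true if eofbit was set after the call
    std::string line; // the extracted line (only meaningful when ok is true)
};

// Hypothetical helper: calls std::getline until it fails and records each call,
// the failing one included.
std::vector<ReadResult> TraceGetline(const std::string& input)
{
    std::istringstream str(input);
    std::vector<ReadResult> trace;
    for (;;)
    {
        ReadResult r{};
        r.ok = static_cast<bool>(std::getline(str, r.line));
        r.eof = str.eof();
        trace.push_back(r);
        if (!r.ok)
        {
            break; // nothing was extracted: this call does not produce a line
        }
    }
    return trace;
}
```

With "a\nb", the call that reads "b" reaches the end of the input (eofbit is set) but still succeeds because characters were extracted. With "a\nb\n", the call that reads "b" stops at the '\n', and the next call extracts nothing and fails. Both inputs therefore give two lines.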