Question

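I'm working in C++. Say a text file contains a string like "word by word"; each value the ifstream gets from the file can be a space, a word, or a newline.

So as the string "eachWord" goes through the while loop, it would take the following values:

1st eachWord: "word"
2nd eachWord: "\0"
3rd eachWord: "by"
.....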

Note: you don't know in advance what strings you'll get, because I'll be reading them from arbitrary text files.

I could do this with ifstream.get() and a bunch of conditional statements, but I'm wondering whether there is a better way.

Answer 1

You can use operator>> on the input stream with whitespace skipping disabled.

Sample code

#include <iostream>
#include <sstream>
#include <string>

int main()
{
    std::stringstream ifs;
    ifs.unsetf(std::ios::skipws);
    ifs.str("word by word\n this is test");

    std::string word;
    char ws;
    bool isWord = false;

    // Keep reading words or if that fails clear the error state and read a white space character
    while ((isWord = static_cast<bool>(ifs >> word)) || (ifs.clear(), ifs >> ws))
    {
        std::cout << "word: '";
        if (isWord)
            std::cout << word;
        else
            std::cout << ws;
        std::cout << "'\n";
    }

    return 0;
}

Sample output

word: 'word'
word: ' '
word: 'by'
word: ' '
word: 'word'
word: '
'
word: ' '
word: 'this'
word: ' '
word: 'is'
word: ' '
word: 'test'
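
To reuse the same idea on a real file, the loop can be wrapped in a function that accepts any std::istream. This is only a sketch: readTokens and tokensOf are hypothetical names that do not come from the answer, and each whitespace character is returned as its own one-character token, as in the sample output.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collects every word and every single whitespace
// character from `in` as separate tokens, in the order they appear.
// Note that it leaves skipws disabled on the caller's stream.
std::vector<std::string> readTokens(std::istream& in)
{
    std::vector<std::string> tokens;
    in.unsetf(std::ios::skipws);   // operator>> no longer skips leading whitespace

    std::string word;
    char ws;
    while (true)
    {
        if (in >> word)            // fails if the next character is whitespace (or at EOF)
        {
            tokens.push_back(word);
            continue;
        }
        in.clear();                // the failed word read set failbit; reset it
        if (!(in >> ws))           // nothing left to read: end of input
            break;
        tokens.push_back(std::string(1, ws));
    }
    return tokens;
}

// Convenience wrapper for in-memory text, used here only for demonstration.
std::vector<std::string> tokensOf(const std::string& text)
{
    std::istringstream in(text);
    return readTokens(in);
}
```

With a file, you would pass a std::ifstream opened on it to readTokens instead of going through tokensOf.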


Answer 2

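You can use the substr or find functions to locate the spaces or other characters you need.

For example:

std::string str = "We think in generalities, but we live in details.";
std::string str2 = str.substr(3, 5);  // str2 == "think"

Please read these pages; they should help: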

http://www.cplusplus.com/reference/string/string/substr/

http://www.cplusplus.com/reference/string/string/find/
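
As a rough illustration of how find and substr could be combined for this question, here is a sketch that walks a string and emits words and individual whitespace characters as separate tokens. The function name splitWithFind and the choice of whitespace set are my own assumptions, not something the answer specifies.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits `text` into words and single whitespace
// characters, using find_first_of to locate the end of each word.
std::vector<std::string> splitWithFind(const std::string& text)
{
    const std::string whitespace = " \t\r\n\v\f";   // assumed delimiter set
    std::vector<std::string> tokens;

    std::string::size_type pos = 0;
    while (pos < text.size())
    {
        if (whitespace.find(text[pos]) != std::string::npos)
        {
            // A whitespace character becomes its own one-character token.
            tokens.push_back(text.substr(pos, 1));
            ++pos;
        }
        else
        {
            // A word runs up to the next whitespace character or the end.
            std::string::size_type end = text.find_first_of(whitespace, pos);
            if (end == std::string::npos)
                end = text.size();
            tokens.push_back(text.substr(pos, end - pos));
            pos = end;
        }
    }
    return tokens;
}
```

This needs the whole text in a std::string first, so for a file you would have to read its contents into memory before calling it.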

Answer 3

Use strtok. In my opinion there is no need to build a class around tokenization unless strtok doesn't give you what you need. Here is an example:

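char myString[] = "Word1 Word2";      // must be a writable array, not a string literal
char *p = strtok(myString, " ");      // first call: pass the string to tokenize

while (p) {
    printf("Token: %s\n", p);
    p = strtok(NULL, " ");            // later calls: pass NULL to continue on the same string
}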

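A few caveats (which may make it unsuitable for your needs). The string is "destroyed" in the process, meaning that EOS characters are written over the delimiters in place. Using it correctly may require you to make a non-const copy of the string. You can also change the delimiter list while parsing.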
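
To make the caveat about the non-const copy concrete, here is a minimal sketch that tokenizes a copy so the original std::string is left untouched. The name tokenizeCopy is hypothetical, and the code assumes C++11 (for nullptr and std::vector::data). Keep in mind that strtok discards the delimiters and merges consecutive ones, so on its own it cannot report spaces and newlines as separate tokens, and it keeps internal state, so it is not safe to use from several threads at once.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: tokenizes a copy of `text`, leaving `text` unchanged.
std::vector<std::string> tokenizeCopy(const std::string& text, const char* delimiters)
{
    // strtok overwrites delimiters with '\0', so it needs a mutable,
    // null-terminated buffer rather than the const data of `text`.
    std::vector<char> buffer(text.begin(), text.end());
    buffer.push_back('\0');

    std::vector<std::string> tokens;
    for (char* p = std::strtok(buffer.data(), delimiters);
         p != nullptr;
         p = std::strtok(nullptr, delimiters))
    {
        tokens.push_back(p);   // copies the token out of the buffer
    }
    return tokens;
}
```

For example, tokenizeCopy("Word1 Word2", " ") yields the two tokens Word1 and Word2.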

Answer 4

A complete C++ solution based on orbitcowboy's suggestion.

#include <vector>
#include <string>
#include <fstream>
#include <iostream>
#include <iterator>
#include <cstring>

int main()
{
    std::vector<std::string>    words;

    std::ifstream ifs("words.txt");
    if (ifs.is_open() == false)
    { std::cerr << "Couldn't open file..." << std::endl; return -1; }

    std::string string_to_split(
                 (std::istreambuf_iterator<char>(ifs))
                 , std::istreambuf_iterator<char>());

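    // strtok modifies its input, so tokenize a writable copy of the file contents
    char * cstr = new char [string_to_split.length()+1];
    std::strcpy (cstr, string_to_split.c_str());

    const char delimiters[]=" \t\r\n\v\f";
    char *p = std::strtok(cstr, delimiters);
    while (p) {
        words.push_back(p);
        words.push_back("");   // empty string after each word (strtok discards the separator itself)
        p = std::strtok(NULL, delimiters);
    }
    delete[] cstr;   // the vector holds its own copies, so the buffer can be freed

    for (const auto& word : words)
    { std::cout << "word: " << word << std::endl; }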
}

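Please note that, because of the way I create string_to_split (the whole file is read into memory and then copied), this is not well suited to large files.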

The range-based for loop at the end uses a C++11 feature; if you are not compiling with a C++11 flag, remove it.
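
If memory use on large files is a concern, one alternative is to process the stream character by character instead of loading it all at once. The following is only a sketch under my own assumptions: forEachToken and collectTokens are hypothetical names, whitespace is classified with std::isspace in the default "C" locale, and C++11 is assumed for the lambda.

```cpp
#include <cctype>
#include <functional>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads `in` one character at a time and calls `handle`
// for each word and for each single whitespace character, in order.
void forEachToken(std::istream& in,
                  const std::function<void(const std::string&)>& handle)
{
    std::string word;
    char c;
    while (in.get(c))
    {
        if (std::isspace(static_cast<unsigned char>(c)))
        {
            if (!word.empty())          // a word just ended: hand it over first
            {
                handle(word);
                word.clear();
            }
            handle(std::string(1, c));  // then the whitespace character itself
        }
        else
        {
            word += c;                  // still inside a word
        }
    }
    if (!word.empty())                  // input ended in the middle of a word
        handle(word);
}

// Demonstration on in-memory text; collecting into a vector is optional.
std::vector<std::string> collectTokens(const std::string& text)
{
    std::istringstream in(text);
    std::vector<std::string> tokens;
    forEachToken(in, [&tokens](const std::string& t) { tokens.push_back(t); });
    return tokens;
}
```

With a std::ifstream passed to forEachToken, only the current word is held in memory at any time, at the cost of writing the conditional logic by hand.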

How do I read a text file word by word (with spaces and newlines kept separate)
