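C++: how do we find the line number of each word in a text file?

Asked: 2014-03-31 19:38:45

Tags: c++ text-files

I am reading a text file word by word, and I am trying to find the number of the line each word is on. For example, if I have the following: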

Dog Cat
Car 
Truck

Dog is found on line 1, Cat is found on line 1, Car is found on line 2, and Truck is found on line 3.

I have the following code:

int main(){
string word;
ifstream inFile;
Node* rootPtr = NULL; // Pointer to the root node

inFile.open("example.txt");
if (!inFile)
    cout << "Unable to open text file";

while (inFile >> word) {
    if (word == "#")
        break;

    /* THIS DOES NOT WORK! Most likely because my text file doesn't contain /n but this is the
    kind of logic I am looking for
    else if (word == "/n"){
        counter++;
        cout << counter;
    }
    */

    else{
    rootPtr = Insert(rootPtr,word.substr(0,10));
    }
}
inOrderPrint(rootPtr);
inFile.close();
}

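You can ignore anything to do with pointers; that is for something else. I have tried to figure out how to check for the end of a line and keep a counter that increments each time a line ends, but I have not succeeded.

Thanks for your help!

5 answers:

Answer 0 (score: 0)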

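You can always read the file with getline (http://www.cplusplus.com/reference/string/string/getline/) and count the lines yourself. You then, of course, split each line you read into words along the lines of: Split a string in C++?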
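A minimal sketch of that idea, assuming the words are separated by whitespace. The helper name wordsWithLineNumbers is hypothetical (it is not from the question); it takes any std::istream, so an std::ifstream such as inFile can be passed in directly.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: pairs every word with the 1-based number of its line.
std::vector<std::pair<std::string, int>> wordsWithLineNumbers(std::istream& in)
{
    std::vector<std::pair<std::string, int>> result;
    std::string line;
    int lineNumber = 0;
    // getline consumes one line per call, so counting calls counts lines.
    while (std::getline(in, line)) {
        ++lineNumber;
        // Split the line on whitespace with a string stream.
        std::istringstream words(line);
        std::string word;
        while (words >> word) {
            result.emplace_back(word, lineNumber);
        }
    }
    return result;
}
```

Because blank lines still go through getline, they are counted even though they contribute no words.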

Answer 1 (score: 0)

You can use the getline function:

string line;
int lineNum = 0; // Or 1 
while(getline(inFile, line))
{
    lineNum++; // one more line has been read
}

If you want to split each line into words, you can use a stringstream.

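#include <sstream>
// Your code
while(getline(inFile, line))
{
    stringstream ssLine(line);
    string substr;
    // Testing the extraction itself ends the loop right after the last word.
    while(ssLine >> substr)
    {
         // substr now holds the next word of the current line (words are separated by whitespace)
    }
    lineNum++; // move on to the next line
}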

Or better, I have my own version of split, which you are welcome to use:

/**
 * Equivalent to java's string.split() function.
 * 
 * @param toPopulate The return value of this function.
 * @param s          The string we want to split.
 * @param delim      The delim which we want to split. This will not be included in
 *                   the splitted string. User should pass only one character to this
 *                   string.
 */
void split(vector<string> &toPopulate, const string &s, const string &delim)
{
    // Start of the current substring (just past the previous delim). Initially 0.
    string::size_type substrStart = 0;
    while (substrStart < s.length())
    {
        // Position of the next delim, or string::npos if there is none.
        string::size_type curFoundPos = s.find(delim, substrStart);
        // Holds the current substring.
        string oneOfSplittedStr;

        if (curFoundPos == string::npos)
        {
            // Delim not found, so take everything from the previous delim to the end.
            oneOfSplittedStr = s.substr(substrStart);
            // Ends the loop; without this statement we would loop forever.
            substrStart = s.length();
        }
        else
        {
            oneOfSplittedStr =
                s.substr(substrStart, curFoundPos - substrStart);
            // The next substring starts one past the position just found.
            substrStart = curFoundPos + 1;
        }
        // Skip empty pieces (e.g. those produced by consecutive delims).
        if (!oneOfSplittedStr.empty())
            toPopulate.push_back(oneOfSplittedStr);
    }
}

You can also always use boost's split.

Answer 2 (score: 0)

You could try a few things. A map would work:

#include <map>
map <int,string> words;

Then add the words:

int wordNum = 0;
while (inFile >> word) {
    if (word == "#")
        break;
    else{
        words[wordNum] = word;
        wordNum++; // advance the key so each word gets its own slot
    }
}

To read your words back:

int x = 0;
while ( x < static_cast<int>(words.size()) ) {
    cout << words[x] << " ";
    x++;
}

Another option is to store the words in a struct containing two strings, or any two appropriate data types (one for your word and one for the #):

struct words_struct{
    string words;
    string wordNum;
} store ;  // single instance

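And store the words like this:

int x=0;
while (inFile >> word) {
    if (word == "#")
        break;
    else{
        // store is an object, not a pointer, so use '.' rather than '->'.
        store.words.append(word + " ");           // the space keeps the words apart
        store.wordNum.append(to_string(x) + " "); // to_string (C++11) converts the int
        x++;
    }
}

The code above is a rough sketch rather than a polished solution, but the gist is right. Hope this helps! Good luck!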

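Answer 3 (score: 0)

There is a problem with your requirement: what if a word appears on more than one line?

For the simple case, I suggest keeping a line counter and using std::map<string, unsigned int>, where the string is your word and the unsigned int is the number of the line on which it first occurs.

To handle every line number on which a word appears, you may want to use std::map<string, std::vector<unsigned int> >, where the std::vector holds all the line numbers on which the word appears.

Example: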

typedef std::map<std::string, unsigned int> Word_Ref_Container;
Word_Ref_Container word_line_reference;
//...
std::string text_line;
unsigned int line_number = 1;
while (std::getline(input_file, text_line))
{
  std::istringstream text_stream(text_line);
  std::string  word;
  while (text_stream >> word)
  {
     // Record the line number only the first time the word is seen.
     if (word_line_reference.find(word) == word_line_reference.end())
     {
        word_line_reference[word] = line_number;
     }
  }
  ++line_number;
}
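The second suggestion in this answer, recording every line on which a word appears, could be sketched as follows. The names Word_Lines_Container and indexWordLines are hypothetical, chosen to mirror the example above, and the function reads from any std::istream so it also accepts an open std::ifstream.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container: word -> all line numbers on which it appears.
typedef std::map<std::string, std::vector<unsigned int> > Word_Lines_Container;

Word_Lines_Container indexWordLines(std::istream& input)
{
    Word_Lines_Container word_lines;
    std::string text_line;
    unsigned int line_number = 1;
    while (std::getline(input, text_line))
    {
        std::istringstream text_stream(text_line);
        std::string word;
        while (text_stream >> word)
        {
            // operator[] creates an empty vector the first time a word is seen.
            std::vector<unsigned int>& lines = word_lines[word];
            // Line numbers only grow, so checking the last entry is enough
            // to avoid listing the same line twice for a repeated word.
            if (lines.empty() || lines.back() != line_number)
            {
                lines.push_back(line_number);
            }
        }
        ++line_number;
    }
    return word_lines;
}
```

Each vector ends up sorted in ascending order, because lines are processed from top to bottom.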

Answer 4 (score: 0)

I found a good way to do this. All I had to do was add the following:

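    char c;
    while (inFile >> word) {
        rootPtr = Insert(rootPtr,word.substr(0,10));
        // Look at the character right after the word; a newline means the line has ended.
        // (This misses a newline that follows trailing spaces or sits on a blank line.)
        c = inFile.get();
        if (c=='\n'){
            counter++;
        }
    }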

Thanks for all the suggestions!