Question

I am trying to count identical strings/words in a text file using C++.

This is my text file
one two three two
test testing 123
1 2 3

This is my main program

#include <iostream>
#include <fstream>
#include <string>

using namespace std;

int main(int argc, const char** argv)
{
    int counter = 0;
    int ncounter = 0;
    string str;
    ifstream input(argv[1]);

    while (getline(input, str)) 
    {
        if(str.find("two") != string::npos){counter++;}
        if(str.find('\n') != string::npos){ncounter++;}

        cout << str << endl; //To show the content of the file
    }

    cout << endl;
    cout << "String Counter: " << counter << endl;
    cout << "'\\n' Counter: " << ncounter << endl;

    return 0;
}

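I am using the .find() function to look for the string. When I search for a word that does not exist, it is not counted. When I search for the word "two", it is counted, but only once.

Why isn't it counted twice?

As for the line break (newline; \n), it doesn't count any at all. Why is that?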

Answer 1

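Because both occurrences of "two" are on the same line, and you search each line for the substring only once: find returns the position of the first match and then you move on to the next line. You can't find '\n' because getline extracts the delimiter and discards it, so the line it stores in str never contains a '\n'.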
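A minimal sketch of one way to act on this, not a definitive fix: call find repeatedly, resuming after each match, and count lines instead of searching for '\n'. The names count_occurrences, FileCounts, count_in_stream and count_in_text are hypothetical helpers introduced for illustration. Two assumptions to keep in mind: this still matches substrings (so "two" inside a longer word would be counted), and the line count equals the number of '\n' characters only when the file ends with a newline.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Count every non-overlapping occurrence of `needle` in `line`.
std::size_t count_occurrences(const std::string& line, const std::string& needle)
{
    if (needle.empty()) return 0;  // an empty pattern would never advance
    std::size_t count = 0;
    std::size_t pos = line.find(needle);
    while (pos != std::string::npos) {
        ++count;
        pos = line.find(needle, pos + needle.size());  // resume after this match
    }
    return count;
}

// Hypothetical result type: total matches and number of lines read.
struct FileCounts {
    std::size_t matches = 0;
    std::size_t lines = 0;
};

// Works for any input stream, e.g. a std::ifstream opened on argv[1].
FileCounts count_in_stream(std::istream& input, const std::string& needle)
{
    FileCounts result;
    std::string line;
    while (std::getline(input, line)) {
        result.matches += count_occurrences(line, needle);
        ++result.lines;  // getline already removed the line terminator
    }
    return result;
}

// Convenience wrapper for trying the logic on an in-memory string.
FileCounts count_in_text(const std::string& text, const std::string& needle)
{
    std::istringstream input(text);
    return count_in_stream(input, needle);
}
```

With the text file from the question, passing the opened ifstream and "two" to count_in_stream should report both occurrences on the first line.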

Answer 2

Why not use a std::multiset to store the words? It does the counting for you, and the file can be read into it in a single statement:

#include <iostream>
#include <fstream>
#include <string>
#include <set>
#include <iterator>
#include <cstdlib>   // for system()

int main(int argc, const char** argv)
{
    // Open the file
    std::ifstream input(argv[1]);

    // Read all the words into a set
    std::multiset<std::string> wordsList = 
        std::multiset<std::string>( std::istream_iterator<std::string>(input),
                                    std::istream_iterator<std::string>());

    // Iterate over every word
    for(auto word = wordsList.begin(); word != wordsList.end(); word=wordsList.upper_bound(*word))
        std::cout << *word << ": " << wordsList.count(*word) << std::endl;

    // Done
    system("pause");
    return 0;
}

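Note the last part of the for statement: word=wordsList.upper_bound(*word). It ensures that each value in the set is output only once. Technically you could switch it to word++ (in which case it would be better to shorten the loop to for(auto word: wordsList)), but then a word that occurs several times would be printed once per occurrence.

It also lists the words themselves, so you no longer need to print them in a while loop as you do now.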
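To make the effect of the upper_bound step concrete, here is a small sketch that works on an in-memory string instead of a file. The helpers read_words and distinct_counts are hypothetical names introduced for illustration; the loop itself is the same one used in the answer above.

```cpp
#include <cstddef>
#include <iterator>
#include <set>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Read whitespace-separated words from a string into a multiset.
std::multiset<std::string> read_words(const std::string& text)
{
    std::istringstream input(text);
    std::istream_iterator<std::string> first(input), last;
    return std::multiset<std::string>(first, last);
}

// Collect (word, count) pairs, one entry per distinct word.
// upper_bound(*it) jumps past all duplicates of the current word.
std::vector<std::pair<std::string, std::size_t>>
distinct_counts(const std::multiset<std::string>& words)
{
    std::vector<std::pair<std::string, std::size_t>> result;
    for (auto it = words.begin(); it != words.end(); it = words.upper_bound(*it))
        result.emplace_back(*it, words.count(*it));
    return result;
}
```

For the sample file in the question, the multiset holds ten words but distinct_counts yields nine entries, because the two copies of "two" collapse into a single entry with a count of 2. A plain range-based for over the multiset would visit all ten elements.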

Answer 3

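Your best option is to read each line and then tokenize it on whitespace, so that you can check each word individually.

I suspect we are talking about homework here, so my best answer is to point you to the C++ reference for std::strtok: http://en.cppreference.com/w/cpp/string/byte/strtok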
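As a rough illustration of the tokenizing approach, the sketch below splits one line on whitespace with std::strtok and compares each token against the target word. count_word is a hypothetical helper name, and the delimiter set is an assumption. Because std::strtok writes into the buffer it scans, the line is taken by value so the caller's string is left untouched.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Count whole-word matches of `target` in `line`, splitting on whitespace.
// `line` is passed by value because std::strtok modifies its input buffer.
std::size_t count_word(std::string line, const std::string& target)
{
    const char* delimiters = " \t\r\n";  // assumed set of separators
    std::size_t count = 0;
    for (char* token = std::strtok(&line[0], delimiters);
         token != nullptr;
         token = std::strtok(nullptr, delimiters)) {
        if (target == token) ++count;  // exact match only, not a substring
    }
    return count;
}
```

Calling count_word on each line returned by getline and summing the results gives a whole-word count, so "test" would not be counted inside "testing". Keep in mind that std::strtok keeps internal state between calls, so it is not safe to interleave two tokenizations or to use it from several threads at once.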

Counting identical strings/words in a text file in C++

3 answers: