Question

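My task: I have a file containing an unknown number of words, some of which are repeated an unknown number of times, and I have to find those repeated words. I'm using a class and a vector to handle the words, and fstream to handle the file. But I can't find a resource or an algorithm for finding the repeated words, and I'm confused. I have a vector of the variable's type and I push each word into it. That part runs fine; I tested it by printing v.size(). I've done everything except the algorithm for finding the repeated words, which I'm finding hard to solve.

The full code I've written: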

#include <iostream>
#include <string>
#include <fstream>
#include <vector>
#include <algorithm>
#include <stdio.h>
#include <iterator>
using namespace std;
class Wording {
private:
    string word;
    vector <string> v;
public:

    Wording(string Alternateword, vector <string> Alternatev) {
        v = Alternatev;
        word = Alternateword;
    }
};
int main() {
    ifstream ifs("words.txt");
    ofstream ofs("wordresults.txt");
    string word;
    vector <string> v;
    Wording obj(word,v);
    while(ifs >> word) v.push_back(word);
    for(int i=0; i<v.size(); i++) {

        //waiting for algorithm
        //ofs << v[i] << endl;
    }
    return 0;
}

Answer 1

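Try using a hash map. With GNU C++, that is the hash_map extension (__gnu_cxx::hash_map in <ext/hash_map>). In C++11 you can use std::unordered_map, which gives you the same functionality. Otherwise, an equivalent is available from Boost (boost::unordered_map), and probably elsewhere too.

The key concept here is hash_map<word, count>.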

Answer 2

Are the unique words in the input file what you want? If so, you can do it with a set (or an unordered_set, if you don't really need them sorted), like this:

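// requires <set>, <iterator>, <algorithm>
std::set<std::string> words; // can be changed to unordered_set
// read every whitespace-separated word from ifs; the set keeps one copy of each
std::copy(std::istream_iterator<std::string>(ifs), std::istream_iterator<std::string>(), std::inserter(words, words.begin()));
// write the unique words, one per line
std::copy(words.begin(), words.end(), std::ostream_iterator<std::string>(ofs, "\n"));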

You can also use a vector, but you'll have to sort it and then use unique on it.

I can't compile this code right now, so there may be some errors in my snippet.
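
For the vector variant (sort, then unique), a rough sketch could look like the following. The function name unique_words is invented for this example; note that std::unique only moves the duplicates to the end, so erase is still needed to drop them:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: returns the distinct words in sorted order.
// Takes the vector by value so the caller's copy is left unchanged.
std::vector<std::string> unique_words(std::vector<std::string> v) {
    std::sort(v.begin(), v.end());                      // equal words become adjacent
    v.erase(std::unique(v.begin(), v.end()), v.end());  // drop the adjacent duplicates
    return v;
}
```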

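If what you want is the number of occurrences of each distinct word in the file, then you'll have to use some kind of map, as already suggested. Of course, using a vector, sorting it, and then counting consecutive equal words is also a solution, but it won't be as clear.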
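
To illustrate the sort-and-count-consecutive alternative, here is one possible sketch. The name count_runs and the choice of a vector of pairs for the result are assumptions made for this example only:

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: sorts the words, then counts each run of equal
// neighbours. Returns (word, count) pairs in sorted word order.
std::vector<std::pair<std::string, std::size_t>> count_runs(
        std::vector<std::string> v) {
    std::sort(v.begin(), v.end());
    std::vector<std::pair<std::string, std::size_t>> counts;
    for (const auto& w : v) {
        if (!counts.empty() && counts.back().first == w) {
            ++counts.back().second;   // same word as the previous one: extend the run
        } else {
            counts.emplace_back(w, 1);  // a new word starts a new run
        }
    }
    return counts;
}
```

Any pair whose count is greater than 1 is a repeated word. Compared with the map version this needs no extra container type, at the cost of sorting the whole vector first.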

How to find repeated words in a file using a vector in C++
