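My code below gives me the most frequently occurring word(s) in a string. I would like to get the three most frequently occurring words together with their counts. Can anyone help?
I used a vector and an unordered_map. In the last part of the code, I read the most frequent word(s) from the vector.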
#include <iostream>
#include <sstream>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

int main(int argc, char *argv[])
{
    if (argc < 2)
        return 1; // no input string was passed on the command line

    typedef std::unordered_map<std::string, int> occurrences;
    occurrences s1;
    std::string input = argv[1];
    std::istringstream iss(std::move(input));
    std::vector<std::string> most;
    int max_count = 0, second = 0, third = 0;

    // Here I get max_count, 2nd highest and 3rd highest count value
    while (iss >> input)
    {
        int tmp = ++s1[input];
        if (tmp == max_count)
        {
            most.push_back(input);
        }
        else if (tmp > max_count)
        {
            max_count = tmp;
            most.clear();
            most.push_back(input);
            third = second;
            second = max_count;
        }
        else if (tmp > second)
        {
            third = second;
            second = tmp;
        }
        else if (tmp > third)
        {
            third = tmp;
        }
    }

    // I have not used max_count, second, third below. I don't know how to access them for my purpose.
    // Print each word with its occurrence count. This works fine.
    for (occurrences::const_iterator it = s1.cbegin(); it != s1.cend(); ++it)
        std::cout << it->first << " : " << it->second << std::endl;

    // Prints the word(s) occurring the maximum number of times. Here I want to print the 1st, 2nd and 3rd
    // most frequently occurring words with their occurrence counts. How can I do that?
    std::cout << std::endl << "Maximum Occurrences" << std::endl;
    for (std::vector<std::string>::const_iterator it = most.cbegin(); it != most.cend(); ++it)
        std::cout << *it << std::endl;
    return 0;
}
Any ideas on how to get the 3 most frequent words?
Answer 0 (score: 3)
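I would prefer to use std::map<std::string, int> instead.
Use it as the source map and fill it from the std::vector<std::string>.
Now create a multimap, a flipped version of the source map (count as key, word as value), with std::greater<int> as the comparator.
The first three entries of this final map are the most frequent words.
Example: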
#include <algorithm>
#include <functional>
#include <iostream>
#include <iterator>
#include <map>
#include <string>
#include <utility>
#include <vector>

int main()
{
    std::vector<std::string> most { "lion", "tiger", "kangaroo",
                                    "donkey", "lion", "tiger",
                                    "lion", "donkey", "tiger"
                                  };

    // Source map: word -> number of occurrences
    std::map<std::string, int> src;
    for (const auto &x : most)
        ++src[x];

    // Flipped map: count -> word, ordered from the highest count down
    std::multimap<int, std::string, std::greater<int> > dst;
    std::transform(src.begin(), src.end(), std::inserter(dst, dst.begin()),
                   [] (const std::pair<const std::string, int> &p) {
                       return std::pair<int, std::string>(p.second, p.first);
                   }
                  );

    // auto keeps the iterator type in sync with dst (including its comparator)
    auto it = dst.begin();
    for (int count = 0; count < 3 && it != dst.end(); ++it, ++count)
        std::cout << it->second << ":" << it->first << std::endl;
}
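Answer 1 (score: 1)
Using a heap to store the three most frequent words is easier and cleaner. It also extends easily to a larger number of most-frequent words.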
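The answer gives no code, so the following is only a minimal sketch of one way the heap idea could look. The function name top_k_words and the choice of a min-heap of (count, word) pairs via std::priority_queue are assumptions made for illustration, not something stated in the answer. The heap never holds more than k entries: whenever it grows past k, the entry with the lowest count is evicted.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper (not from the original answer): returns up to k
// (count, word) pairs, most frequent first. Ties on count are ordered by
// the word itself, because std::pair compares lexicographically.
std::vector<std::pair<int, std::string>>
top_k_words(const std::vector<std::string>& words, std::size_t k)
{
    // Count every word once.
    std::unordered_map<std::string, int> counts;
    for (const auto& w : words)
        ++counts[w];

    using Entry = std::pair<int, std::string>;
    // std::greater turns priority_queue into a min-heap, so the weakest
    // of the current candidates is always on top.
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;

    for (const auto& kv : counts) {
        heap.emplace(kv.second, kv.first);
        if (heap.size() > k)
            heap.pop(); // evict the current minimum
    }

    // Popping yields ascending order; reverse to get most frequent first.
    std::vector<Entry> result;
    while (!heap.empty()) {
        result.push_back(heap.top());
        heap.pop();
    }
    std::reverse(result.begin(), result.end());
    return result;
}
```

Calling top_k_words(words, 3) on the word list from the question would give the three entries asked for; changing 3 to any other k needs no other code changes, which is the extensibility the answer refers to.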
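Answer 2 (score: 1)
If I wanted to know the n most frequent words, I would keep an n-element array, iterate over the word list, and store in the array those words that make it into my top n (dropping the lowest-ranked word from the array).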
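As a rough sketch of the array idea, assuming the words are counted first with an unordered_map as in the question: the names top_n_words and ranks_before are hypothetical, and breaking ties alphabetically is an added assumption to make the result deterministic. The array is kept sorted from most to least frequent, each candidate is inserted at its rank, and the last element is dropped whenever the array exceeds n entries.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using WordCount = std::pair<std::string, int>;

// Hypothetical ordering rule: higher count first, then alphabetical order
// for equal counts (the tie-break is an assumption, not from the answer).
static bool ranks_before(const WordCount& a, const WordCount& b)
{
    if (a.second != b.second)
        return a.second > b.second;
    return a.first < b.first;
}

// Hypothetical helper: returns up to n (word, count) pairs, best first.
std::vector<WordCount>
top_n_words(const std::vector<std::string>& words, std::size_t n)
{
    std::unordered_map<std::string, int> counts;
    for (const auto& w : words)
        ++counts[w];

    std::vector<WordCount> top; // never grows beyond n + 1 entries
    top.reserve(n + 1);

    for (const auto& kv : counts) {
        const WordCount candidate(kv.first, kv.second);
        // Find the first stored entry that the candidate outranks.
        auto pos = std::find_if(top.begin(), top.end(),
                                [&candidate](const WordCount& e) {
                                    return ranks_before(candidate, e);
                                });
        top.insert(pos, candidate);
        if (top.size() > n)
            top.pop_back(); // drop the lowest-ranked entry
    }
    return top;
}
```

For small n such as 3, the linear scan over the array is cheap; for large n, a heap or std::partial_sort would likely be a better fit.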