Creating a histogram of the top 10 words in a text file

Time: 2013-12-03 02:03:45

Tags: c++

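I am trying to create a histogram of how often the ten most common words occur in a text, as percentages. It needs to be in a format similar to this: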

the: -------------------- 20 (that is, 20% and 20 dashes)

is: --------------- 15

he: ---------- 10

war: ----- 5

has: ----- 5

hero: ----- 5

have: ---- 4

girl: ----- 5

angry: ---- 4

others: -------------------------------- 32

I know I need a for loop that runs 10 times, with another loop inside it that prints one dash per percentage point.
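
The nested-loop idea can be sketched roughly as follows. This is only an illustration, not the asker's code: the `Row` struct and `formatHistogram` function are hypothetical names, and the percentages are assumed to be already computed as whole numbers.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record: a word and its share of the text as a whole-number percentage.
struct Row {
    std::string word;
    int percent;
};

// Outer loop: at most `limit` rows. Inner loop: one dash per percentage point.
std::string formatHistogram(const std::vector<Row>& rows, int limit = 10) {
    assert(limit >= 0);
    std::ostringstream out;
    for (int i = 0; i < static_cast<int>(rows.size()) && i < limit; i++) {
        out << rows[i].word << ": ";
        for (int d = 0; d < rows[i].percent; d++) {
            out << '-';
        }
        out << ' ' << rows[i].percent << '\n';
    }
    return out.str();
}
```

Writing the result with `cout << formatHistogram(rows);` prints one line per word in the format shown above.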

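#include <algorithm>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

// word_stats, words_to_vector, get_word_stats and compare_stats are not shown here.
int main() {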

string file_name;
cout << "Enter the file name, the odyssey or pride and prejudice, that you would like to analyze: ";
getline(cin,file_name);

vector<string> words; //loading words into vector called words
if(file_name == "test")
{
    words = words_to_vector("test.txt"); // used a test file so it would take less time
}
else if(file_name == "pride and prejudice")
{
    words = words_to_vector("pride-prejudice.txt");
}
else if (file_name == "the odyssey")
{
    words = words_to_vector("the-odyssey.txt");
}
else
{
    cout << "Sorry, the file name you entered is invalid." << endl;
    return 1; // nothing to analyze, so stop here
}

vector<word_stats> stats = get_word_stats(words); // calculating the number of words, and how many of those words are unique

cout << endl << endl << "Total words: " << words.size() << ", " << stats.size() << " unique" << endl;

sort(stats.begin(), stats.end(), compare_stats);

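if (stats.empty()) { // guard: stats[0] below would be out of range for an empty file
    cout << "No words found." << endl;
    return 0;
}
cout << "Most common word is '" << stats[0].word << "' occurring " << stats[0].count << " times." << endl;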


int topMostCount = 10; //where I need help putting this into a histogram
//int totalWords = words.size();
int totalUniqueWords = stats.size();
//most common
for (int i = 0; i < static_cast<int>(stats.size()) && i < topMostCount; i++) {
    cout << stats[i].word << ": " << stats[i].count << endl;
}

return 0;
}

1 answer:

Answer 0 (score: 0)

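How about writing a function that generates the dashes? Return them as a std::string rather than a const char *: a pointer obtained from c_str() on a local string dangles as soon as the function returns.

For the output:
for (int i = 0; i < static_cast<int>(stats.size()) && i < topMostCount; i++) {
    // The conversion to int truncates, so 4.9% prints 4 dashes.
    int percent = static_cast<int>(100.0 * stats[i].count / words.size());
    cout << stats[i].word << ": " << genDashes(percent) << " " << percent << "%" << endl;
}

The genDashes function looks like this:

std::string genDashes(int n)
{
    // The fill constructor builds n copies of '-'; a negative n yields an empty string.
    return std::string(n > 0 ? n : 0, '-');
}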
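
The loop above handles the top rows but not the final "others" row from the requested format. One possible way to compute all of the rows is sketched below. The function name `topWordPercents` is hypothetical, and the sketch counts words with a `std::map` instead of using the asker's `get_word_stats`, which is not shown. It also assumes the "others" percentage should come from the occurrences not covered by the top rows; because each percentage is truncated, the rows may not add up to exactly 100.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns (label, whole-number percent) pairs for the
// topN most frequent words, followed by an "others" row for everything else.
std::vector<std::pair<std::string, int>> topWordPercents(
        const std::vector<std::string>& words, std::size_t topN) {
    std::vector<std::pair<std::string, int>> rows;
    if (words.empty()) {
        return rows;  // avoid dividing by zero below
    }

    // Count occurrences of each distinct word.
    std::map<std::string, std::size_t> counts;
    for (const std::string& w : words) {
        ++counts[w];
    }

    // Sort by count, highest first; ties fall back to alphabetical order.
    std::vector<std::pair<std::string, std::size_t>> sorted(counts.begin(), counts.end());
    std::sort(sorted.begin(), sorted.end(),
              [](const std::pair<std::string, std::size_t>& a,
                 const std::pair<std::string, std::size_t>& b) {
                  if (a.second != b.second) return a.second > b.second;
                  return a.first < b.first;
              });

    std::size_t covered = 0;  // occurrences accounted for by the top rows
    for (std::size_t i = 0; i < sorted.size() && i < topN; i++) {
        covered += sorted[i].second;
        // Integer division truncates toward zero.
        rows.emplace_back(sorted[i].first,
                          static_cast<int>(100 * sorted[i].second / words.size()));
    }
    assert(covered <= words.size());
    if (covered < words.size()) {
        rows.emplace_back("others",
                          static_cast<int>(100 * (words.size() - covered) / words.size()));
    }
    return rows;
}
```

Each returned pair can then be printed as the word, a colon, `std::string(percent, '-')`, and the percentage.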