Putting words from a file into a hash map (C++)

Time: 2015-06-04 04:14:12

Tags: c++ dictionary file-io while-loop hashmap

So, I have a very long text file (10k+ words), and I am trying to put each unique word into a hash map using the standard map library.

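I have a while loop that reads each word from the file. The problem is that this loop never seems to end. I even put an if statement inside the loop so that it breaks out when it reaches eof(). It still doesn't end. Here is my code so far: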

#include <iostream>
#include <map>
#include <string>
#include <fstream>
#include <cctype>
using namespace std;


string lowerCase(string isUpper);

void main()
{
//create hash map
map<string, int> stringCounts;

//temp string
string nextString;

//import file/write file
ofstream writeFile;
ifstream gooseFile;

//open file to read from
gooseFile.open("goose.txt");
if (gooseFile.is_open()) {
    //read file word by word
    while (gooseFile >> nextString) { //WORKS DO NOT CHANGE
        //check for punctuation
        for (int i = 0; i < nextString.length(); i++) { //WORKS DO NOT CHANGE
            if (nextString[i] == ',' || nextString[i] == '!' || nextString[i] == ';' || nextString[i] == '-' || nextString[i] == '.' || nextString[i] == '?' || nextString[i] == ':' || nextString[i] == '"' || nextString[i] == '(' || nextString[i] == ')' || nextString[i] == '_' || nextString[i] == '\'') {
                nextString.erase(i, i);
                i--;
            }
        }
        //put all into lowercase
        nextString = lowerCase(nextString); //WORKS DO NOT CHANGE
        //cout << nextString << endl;

        //increment key value
        stringCounts[nextString]++;

        if (gooseFile.eof())
            break;
    }
}

//close current file
gooseFile.close();
cout << "I GOT HERE!";
//now print to an output file
writeFile.open("output.txt");
if (writeFile.is_open()) {
    cout << "ITS OPEN AGAIN";
    //write size of map
    writeFile << "The size of the hash map is " << stringCounts.size() << endl;
    //write all words in map
    //create iterator
    map<string, int>::iterator i = stringCounts.begin();
    //iterate through map 
    while (i != stringCounts.end()) {
        writeFile << "The key and value is : (" << i->first << "," << i->second << ")\n";
        i++;
    }
}
else
    cout << "CANT OPEN\n";
}


string lowerCase(string isUpper)
{
    string toReplace = isUpper;
    for (int i = 0; i < toReplace.length(); i++) {
        if (toReplace[i] >= 65 && toReplace[i] <= 90) {
            toReplace[i] = tolower(toReplace[i]);
        }
    }
    return toReplace;
}

1 Answer:

Answer 0 (score: 3)

nextString.erase(i, i);

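I doubt this is what you want. The overload of string::erase that you are calling expects a position (where to start erasing) and a count (how many characters to erase). So this line erases as many characters as the index of the current character in the string. For example, if i is 0, it erases 0 characters. Combine that fact with the next line: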

i--;

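and if the first character is punctuation, nothing is erased, i is decremented to -1, and the loop's i++ brings it straight back to 0. The same character is tested again on every pass, so the for loop never ends. If you just want to erase 1 character, you can do this: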

nextString.erase(i, 1);
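To make the position/count behaviour concrete, here is a minimal sketch. The names stripChar and demoEraseCount are hypothetical helpers introduced for illustration and are not part of the original post; the loop only advances the index when nothing was erased, which avoids the i-- adjustment altogether.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not from the original post: removes every occurrence
// of `unwanted` from `word`, erasing exactly one character per match.
std::string stripChar(std::string word, char unwanted)
{
    std::string::size_type i = 0;
    while (i < word.length()) {
        if (word[i] == unwanted) {
            word.erase(i, 1);  // position i, count 1
            // stay at i: the following character has shifted into this slot
        } else {
            ++i;
        }
    }
    return word;
}

// Hypothetical demo of how the second argument (the count) behaves.
void demoEraseCount()
{
    std::string s = "!hello";
    s.erase(0, 0);             // count 0: nothing is removed
    assert(s == "!hello");
    s.erase(0, 1);             // count 1: removes only the '!'
    assert(s == "hello");
    s.erase(2, 2);             // count 2: removes "ll"
    assert(s == "heo");
}
```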

But it would be much better to replace the whole for loop and use the erase-remove idiom.

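auto new_end = std::remove_if(nextString.begin(), nextString.end(),
        [](unsigned char c) {
            // return true if c is punctuation; std::ispunct (from <cctype>)
            // is one option, though it matches more than the question's list
            return std::ispunct(c) != 0;
        });
nextString.erase(new_end, nextString.end());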
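As a rough sketch of how the idiom might fit into the word-counting loop from the question, the following assumes two hypothetical helpers, normalizeWord and countWords, that do not appear in the original post. It also assumes that std::ispunct is an acceptable stand-in for the hand-written punctuation list (it matches more characters), and that tokens consisting only of punctuation should be skipped rather than counted as an empty key.

```cpp
#include <algorithm>
#include <cctype>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: strips punctuation with the erase-remove idiom,
// then lowercases what is left.
std::string normalizeWord(std::string word)
{
    // Assumption: std::ispunct is broader than the question's explicit list.
    auto newEnd = std::remove_if(word.begin(), word.end(),
        [](unsigned char c) { return std::ispunct(c) != 0; });
    word.erase(newEnd, word.end());

    std::transform(word.begin(), word.end(), word.begin(),
        [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return word;
}

// Hypothetical helper: counts normalized words read from any input stream.
std::map<std::string, int> countWords(std::istream& in)
{
    std::map<std::string, int> counts;
    std::string word;
    while (in >> word) {       // a failed extraction ends the loop at end of input
        word = normalizeWord(word);
        if (!word.empty())     // skip tokens that were pure punctuation
            ++counts[word];
    }
    return counts;
}
```

Because countWords takes a std::istream&, it can be called with the question's ifstream or with a std::istringstream for quick checks. The while (in >> word) condition already stops at end of input, so no separate eof() test is needed inside the loop.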