Question

Suppose I have a text file containing:

today is today but
tomorrow is today tomorrow

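Using a map, how can I keep track of the repeated words, and of which line each repetition occurs on? So far I read each string in the file into temp and store it in the following way: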

    map<string,int> storage;

    int count = 1; // for the first line of the file

    if(infile.is_open()){
        while( !infile.eof() ){
            getline(infile, line);
            istringstream my_string(line);
            while(my_string.good()){
                string temp;
                my_string >> temp;

                storage[temp] = count;
            }
            count++; // so that every string read in the next line will be recorded as that line.
        }
    }
    map<string,int>::iterator m;
    for(m = storage.begin(); m != storage.end(); m++){
        out<<m->first<<": "<<"line "<<m->second<<endl;
    }

Right now the output is just

but: line 1
is: line 2
today: line 2
tomorrow: line 2

But... it should print (with no repeated strings):

today : line 1 occurred 2 times, line 2 occurred 1 time.
is: line 1 occurred 1 time, line 2 occurred 1 time.
but: line 1 occurred 1 time.
tomorrow: line 2 occurred 2 times.

Note: the order of the strings does not matter.

Any help would be appreciated. Thanks.

Answer 1

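A map stores (key, value) pairs with unique keys. That means that if you assign to the same key multiple times, only the last value you assigned is stored.

It sounds like what you want is, instead of storing the line as the value, to store another map of line -> occurrences.

So you could make your map like this: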

typedef int LineNumber;
typedef int WordHits;
typedef map< LineNumber, WordHits> LineHitsMap;
typedef map< string, LineHitsMap > WordHitsMap;
WordHitsMap storage;

And then insert like this:

WordHitsMap::iterator wordIt = storage.find(temp);
if(wordIt != storage.end())
{
    LineHitsMap::iterator lineIt = (*wordIt).second.find(count);
    if(lineIt != (*wordIt).second.end())
    {
        (*lineIt).second++;
    }
    else
    {
        (*wordIt).second[count] = 1;
    }
}
else
{
    LineHitsMap lineHitsMap;
    lineHitsMap[count] = 1;
    storage[temp] = lineHitsMap;
}
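As a side note, the same insertion can be written more compactly, because std::map::operator[] creates a missing entry with a value-initialized value (an empty inner map, or a 0 counter). The following is only a sketch under that observation: count_words and report are made-up helper names, the typedefs are the ones above, and the read loop is an assumption about how the file is processed.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

typedef int LineNumber;
typedef int WordHits;
typedef std::map<LineNumber, WordHits> LineHitsMap;
typedef std::map<std::string, LineHitsMap> WordHitsMap;

// Hypothetical helper: count every word in the stream, line by line.
WordHitsMap count_words(std::istream& in) {
    WordHitsMap storage;
    std::string line;
    LineNumber count = 1; // 1-based line number
    while (std::getline(in, line)) {
        std::istringstream words(line);
        std::string temp;
        while (words >> temp) {
            // operator[] creates the inner map and a zero counter on first use,
            // so no find()/end() checks are needed.
            ++storage[temp][count];
        }
        ++count;
    }
    return storage;
}

// Hypothetical helper: "word: line N occurred K time(s), ..." per word.
std::string report(const WordHitsMap& storage) {
    std::ostringstream out;
    for (WordHitsMap::const_iterator w = storage.begin(); w != storage.end(); ++w) {
        out << w->first << ":";
        const LineHitsMap& hits = w->second;
        for (LineHitsMap::const_iterator l = hits.begin(); l != hits.end(); ++l) {
            out << (l == hits.begin() ? " " : ", ")
                << "line " << l->first << " occurred " << l->second
                << (l->second == 1 ? " time" : " times");
        }
        out << ".\n";
    }
    return out.str();
}
```

Calling report(count_words(infile)) on the two-line file from the question should give one line per word in alphabetical order (but, is, today, tomorrow), since std::map iterates its keys in sorted order.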

Answer 2

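You are trying to get 2 items of information out of a collection when you only store 1 item of information in it.

The easiest way to extend your current implementation is to store a struct instead of an int.

So instead of: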

storage[temp] = count

you would do:

storage[temp].linenumber = count;
storage[temp].wordcount++;

where the map is defined as:

struct worddata { int linenumber; int wordcount; };
std::map<string, worddata> storage;

Print the results using:

out << m->first << ": " << "line " << m->second.linenumber << " count: " << m->second.wordcount << endl;

Edit: use a typedef for the definition, e.g.:

typedef std::map<std::string, worddata> MYMAP;
MYMAP storage;

and then MYMAP::iterator iter;
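Put together, the struct approach might look like the sketch below. tally is a made-up function name and the read loop is an assumption; worddata and MYMAP are as defined above. Note that this keeps only the last line a word was seen on plus a total count, so it would not by itself produce the per-line counts shown in the question.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

struct worddata {
    int linenumber; // last line on which the word was seen
    int wordcount;  // total occurrences across the whole input
};

typedef std::map<std::string, worddata> MYMAP;

// Hypothetical helper: fill the map from any input stream.
MYMAP tally(std::istream& in) {
    MYMAP storage;
    std::string line;
    int count = 1; // 1-based line number
    while (std::getline(in, line)) {
        std::istringstream words(line);
        std::string temp;
        while (words >> temp) {
            // operator[] value-initializes a new worddata,
            // so both ints start at 0 before the increment.
            worddata& entry = storage[temp];
            entry.linenumber = count;
            entry.wordcount++;
        }
        ++count;
    }
    return storage;
}
```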

Answer 3

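Your storage data type is insufficient to hold all the information you want to report. You could get there by using a vector for the count storage, but you would have to do a lot of bookkeeping to make sure you actually insert a 0 when a word is not encountered on a line, and that you create a vector of the right size when a new word is encountered. That is not a trivial task.

You could switch the count part to a map of numbers, with the line as first and the count as second... That would reduce the complexity of your code, but it would not be the most efficient approach.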

At any rate, you cannot do what you need with only the std::map you have now.

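Edit: just thought of an alternative version that is easier to generate but harder to report from: std::vector<std::map<std::string, unsigned int> >. For each new line in the file you would generate a new map<string,int> and push it onto the vector. You could also create a helper set<string> containing all the words that appear in the file, for use in the report.

That is probably how I would do it, except that I would encapsulate all that junk in a class so that I just do something like: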

my_counter.word_appearance(word,line_no);
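A rough sketch of what such a class could look like, assuming 1-based line numbers. Only the word_appearance(word, line_no) call comes from the line above; the class name WordCounter and the members hits, words and line_count are hypothetical names chosen for illustration.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical wrapper around vector<map<string, unsigned int> > plus set<string>.
class WordCounter {
public:
    // Record one occurrence of `word` on 1-based line `line_no`.
    void word_appearance(const std::string& word, std::size_t line_no) {
        if (line_no == 0) return; // assumption: line numbers start at 1
        if (lines_.size() < line_no) lines_.resize(line_no); // add empty maps as needed
        ++lines_[line_no - 1][word];
        words_.insert(word);
    }

    // Occurrences of `word` on `line_no`, or 0 if it never appeared there.
    unsigned int hits(const std::string& word, std::size_t line_no) const {
        if (line_no == 0 || line_no > lines_.size()) return 0;
        const std::map<std::string, unsigned int>& m = lines_[line_no - 1];
        std::map<std::string, unsigned int>::const_iterator it = m.find(word);
        return it == m.end() ? 0 : it->second;
    }

    // Every distinct word seen so far, for driving the report.
    const std::set<std::string>& words() const { return words_; }
    std::size_t line_count() const { return lines_.size(); }

private:
    std::vector<std::map<std::string, unsigned int> > lines_; // one map per line
    std::set<std::string> words_;                             // helper set of all words
};
```

A report would then loop over words() and, for each word, over lines 1 to line_count(), printing only the lines where hits(word, line) is nonzero.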

Answer 4

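Apart from anything else, your loops are wrong. You should never loop on the eof or good flags, but on whether the read operation succeeded. You want something like: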

while( getline(in, line) ){ 
      istringstream my_string(line);
      string temp;
      while(my_string >> temp ){
           // do something with temp
      }
}
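To see why this matters, here is a small self-contained sketch of that loop; read_words is a made-up helper that just collects (line number, word) pairs. With the good()-based inner loop from the question, a line ending in whitespace passes the good() check once more after the last word, the extraction then fails, and an empty temp gets stored. Testing the extraction itself avoids that.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: return every word together with its 1-based line number.
std::vector<std::pair<int, std::string> > read_words(std::istream& in) {
    std::vector<std::pair<int, std::string> > result;
    std::string line;
    int line_no = 1;
    while (std::getline(in, line)) {      // stops as soon as a read fails
        std::istringstream my_string(line);
        std::string temp;
        while (my_string >> temp) {       // body runs only for real words
            result.push_back(std::make_pair(line_no, temp));
        }
        ++line_no;
    }
    return result;
}
```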

Need help in C++ using a map to keep track of words in an INPUT file
