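I'm trying to read a dictionary file in which each line contains a word ID, a word, and a frequency, separated by spaces. The problem is that every entry in the map used to store the words ends up with the same word. I would really appreciate any help.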
typedef struct{
int id;
int count;
char* word;
} WORD;
//read file
std::map<int, WORD*> readWordMap(char* file_name)
{
std::ifstream infile(file_name, std::ifstream::in);
std::cout<<"word map read file:"<<file_name<<std::endl;
if (! infile) {
std::cerr<<"oops! unable to open file "<<file_name<<std::endl;
exit(-1);
}
std::map<int, WORD*> map;
std::vector<std::string> tokens;
std::string line;
char word[100];
int size;
while (std::getline(infile, line)) {
size = (int)split(line, tokens, ' ');
WORD* entry = (WORD*) malloc(sizeof(WORD*));
entry->id = atoi(tokens[0].c_str());
entry->count = atoi(tokens[2].c_str());
strcpy(word, tokens[1].c_str());
entry->word = word;
map[entry->id] = entry;
std::cout<< entry->id<<" "<<entry->word<<" "<<entry->count<<std::endl;
}
infile.close();
std::cout<<map.size()<<std::endl;
std::map<int, WORD*>::const_iterator it;
for (it = map.begin(); it != map.end(); it++) {
std::cout<<(it->first)<<" "<<(it->second->word)<<std::endl;
}
return map;
}
//split string by a delimiter
size_t split(const std::string &txt, std::vector<std::string> &strs, char ch)
{
size_t pos = txt.find( ch );
size_t initialPos = 0;
strs.clear();
while( pos != std::string::npos ) {
strs.push_back( txt.substr( initialPos, pos - initialPos + 1 ) );
initialPos = pos + 1;
pos = txt.find( ch, initialPos );
}
strs.push_back( txt.substr( initialPos, std::min( pos, txt.size() ) - initialPos + 1 ) );
return strs.size();
}
Data file:
2 I 1
3 gave 1
4 him 1
5 the 3
6 book 3
7 . 3
8 He 2
9 read 1
10 loved 1
Result:
2 I 1
3 gave 1
4 him 1
5 the 3
6 book 3
7 . 3
8 He 2
9 read 1
10 loved 1
map size:9
2 loved
3 loved
4 loved
5 loved
6 loved
7 loved
8 loved
9 loved
10 loved
Answer 0 (score: 1)
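You never allocate separate memory for WORD::word before the strcpy. Instead, you assign the address of the single local buffer char word[100] to every item in the map, so that address is the same for all items and every entry shows whatever word was copied into the buffer last.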
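The aliasing can be reproduced in isolation. The following is a minimal sketch, not code from the question: the helper names storeSharedBuffer and storeCopies are hypothetical, and the first one deliberately repeats the `entry->word = word;` pattern so the effect is visible.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: copies each word into ONE shared buffer and stores
// the buffer's address, mimicking `entry->word = word;` in the question.
// Assumes bufferSize > 0.
std::vector<const char*> storeSharedBuffer(const std::vector<std::string>& words,
                                           char* buffer, std::size_t bufferSize)
{
    std::vector<const char*> stored;
    for (const std::string& w : words) {
        std::strncpy(buffer, w.c_str(), bufferSize - 1);
        buffer[bufferSize - 1] = '\0';  // strncpy does not always terminate
        stored.push_back(buffer);       // the same address on every iteration
    }
    return stored;
}

// Hypothetical helper: storing std::string copies gives every element
// its own storage, so earlier words are not overwritten.
std::vector<std::string> storeCopies(const std::vector<std::string>& words)
{
    return std::vector<std::string>(words.begin(), words.end());
}
```

After calling storeSharedBuffer, every stored pointer compares equal and reads back as the last word, which matches the "loved" column in the question's output.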
It is better to use std::string instead of C-style strings. You can also use std::stoi to convert a string to an integer. Try this:
struct WORD{
int id;
int count;
std::string word;
};
std::map<int, WORD> readWordMap(const std::string &file_name)
{
...
std::map<int, WORD> map;
...
while (std::getline(infile, line)) {
...
WORD entry;
entry.id = std::stoi(tokens[0]);
entry.count = std::stoi(tokens[2]);
entry.word = tokens[1];
map[entry.id] = entry;
...
}
infile.close();
...
}
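For reference, here is one way the elided parts above might be filled in. This is a sketch under assumptions rather than the answerer's full code: it reads from a std::istream instead of opening a file by name so it can be exercised without a data file, the helper splitLine is hypothetical and (unlike the question's split) does not keep the delimiter in each token, and lines without exactly three tokens are silently skipped.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

struct WORD {
    int id;
    int count;
    std::string word;
};

// Hypothetical helper: split `txt` on `ch`, dropping the delimiter.
std::vector<std::string> splitLine(const std::string& txt, char ch)
{
    std::vector<std::string> tokens;
    std::size_t start = 0;
    std::size_t pos = txt.find(ch);
    while (pos != std::string::npos) {
        tokens.push_back(txt.substr(start, pos - start));
        start = pos + 1;
        pos = txt.find(ch, start);
    }
    tokens.push_back(txt.substr(start));  // remainder after the last delimiter
    return tokens;
}

// Read "id word count" lines into a map keyed by id, storing WORD by value.
// Assumption: malformed lines (not exactly three tokens) are skipped.
// std::stoi throws std::invalid_argument if a numeric field is not a number.
std::map<int, WORD> readWordMap(std::istream& in)
{
    std::map<int, WORD> result;
    std::string line;
    while (std::getline(in, line)) {
        const std::vector<std::string> tokens = splitLine(line, ' ');
        if (tokens.size() != 3) {
            continue;
        }
        WORD entry;
        entry.id = std::stoi(tokens[0]);
        entry.count = std::stoi(tokens[2]);
        entry.word = tokens[1];          // each entry owns its own string
        result[entry.id] = entry;
    }
    return result;
}
```

To read the question's data file, pass a std::ifstream opened on the file name; for a quick check, a std::istringstream holding a few lines works the same way.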
Answer 1 (score: 1)
WORD* entry = (WORD*) malloc(sizeof(WORD*));
This allocates only enough space for a WORD pointer, not for the whole WORD struct.
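Writing the entry's fields into that undersized block overruns the allocation, which is undefined behavior: the writes land at addresses that may not even belong to your program, and the code only appears to keep working by chance. You then add each of those pointers to the map. It should be: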
WORD* entry = new WORD;
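The size mismatch is easy to check directly. The sketch below uses the question's original C-style struct; the two function names are hypothetical and exist only to make the comparison explicit. The exact numbers depend on the platform, so none are hard-coded.

```cpp
#include <cstddef>

// The question's original struct.
typedef struct {
    int id;
    int count;
    char* word;
} WORD;

// Bytes requested by malloc(sizeof(WORD*)): the size of one pointer.
constexpr std::size_t pointerBytes() { return sizeof(WORD*); }

// Bytes the struct actually needs: two ints, a pointer, and any padding.
constexpr std::size_t structBytes() { return sizeof(WORD); }

// Checked at compile time: the struct is larger than a single pointer,
// so a pointer-sized allocation cannot hold it.
static_assert(sizeof(WORD) > sizeof(WORD*),
              "WORD must be larger than a pointer to WORD");
```

If malloc has to stay, writing the size as sizeof(WORD) or sizeof *entry avoids this mistake; with `new WORD` the compiler works out the size for you.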
Here is a cleaner way:
struct WORD{
int id;
int count;
std::string word;
};
while (std::getline(infile, line)) {
WORD* entry = new WORD;
std::istringstream iss(line);
iss >> entry->id >> entry->word >> entry->count;
map[entry->id] = entry;
std::cout<< entry->id<<" "<<entry->word<<" "<<entry->count<<std::endl;
}
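The loop above still hands out raw pointers from new, so something has to delete them later. As a possible variation, and not part of the original answer, the same stream extraction can store WORD by value so there is nothing to free. The function name readWordMapByValue is hypothetical, the input is taken as a std::istream so it can be tested with a string, and lines that fail to parse are skipped by assumption.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

struct WORD {
    int id;
    int count;
    std::string word;
};

// Parse "id word count" lines with stream extraction and store each WORD
// by value in the map. Assumption: lines that do not parse are skipped.
std::map<int, WORD> readWordMapByValue(std::istream& infile)
{
    std::map<int, WORD> result;
    std::string line;
    while (std::getline(infile, line)) {
        WORD entry;
        std::istringstream iss(line);
        // operator>> skips whitespace, so no manual splitting is needed.
        if (iss >> entry.id >> entry.word >> entry.count) {
            result[entry.id] = entry;  // the map keeps its own copy
        }
    }
    return result;
}
```

Storing values trades a copy per entry for automatic cleanup; if pointer semantics are really needed, a smart pointer such as std::unique_ptr<WORD> as the mapped type would be another option.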