I have a file containing lines of the form:
double mass, string seq, int K, int TS, int M, [variable number of ints]
688.83 AFTDSK 1 1 0 3384 2399 1200
790.00 MDSSTK 1 3 1 342 2
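I need a (preferably simple) way to parse this file without Boost. If the number of values per line were constant, I would use the solution here.
Each line will become an object of class Peptide: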
class Peptide {
public:
    double mass;
    string sequence;
    int numK;
    int numPTS;
    int numM;
    set<int> parents;
};
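The first three integers have specific variable names in the object, while all the following integers need to be inserted into a set.
I was lucky enough to get two really great answers, but the run-time difference made the C implementation the best answer for me.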
Answer 0 (score: 10)
If you want to use C++, use C++:
// Requires <fstream>, <list>, <set>, <sstream> and <string>.
std::list<Peptide> list;
std::ifstream file("filename.ext");
std::string line;
while (std::getline(file, line)) {
    // Ignore empty lines.
    if (line.empty()) continue;
    // Stringstreams are your friends!
    std::istringstream row(line);
    // Read ordinary data members.
    Peptide peptide;
    row >> peptide.mass
        >> peptide.sequence
        >> peptide.numK
        >> peptide.numPTS
        >> peptide.numM;
    // Read numbers until reading fails.
    int parent;
    while (row >> parent)
        peptide.parents.insert(parent);
    // Do whatever you like with each peptide.
    list.push_back(peptide);
}
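The loop above trusts every non-empty line to carry the five fixed fields. Here is a minimal sketch of the same stringstream approach with that check added. The function names parsePeptide and readPeptides are my own, not from the question, and Peptide is redeclared as a plain struct only so the snippet compiles on its own:

```cpp
#include <istream>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Stand-in for the Peptide class from the question.
struct Peptide {
    double mass = 0.0;
    std::string sequence;
    int numK = 0;
    int numPTS = 0;
    int numM = 0;
    std::set<int> parents;
};

// Hypothetical helper: parses one line into `out`.
// Returns false if any of the five fixed fields is missing or malformed.
bool parsePeptide(const std::string& line, Peptide& out) {
    std::istringstream row(line);
    Peptide p;
    if (!(row >> p.mass >> p.sequence >> p.numK >> p.numPTS >> p.numM))
        return false;
    // Everything that still parses as an int goes into the set.
    int parent;
    while (row >> parent)
        p.parents.insert(parent);
    out = p;
    return true;
}

// Hypothetical helper: reads every parsable line from a stream,
// silently skipping empty or malformed lines.
std::vector<Peptide> readPeptides(std::istream& in) {
    std::vector<Peptide> result;
    std::string line;
    while (std::getline(in, line)) {
        Peptide p;
        if (parsePeptide(line, p))
            result.push_back(p);
    }
    return result;
}
```

Since readPeptides takes any std::istream, you can pass it a std::ifstream for the real file or a std::istringstream when testing. Whether to skip bad lines silently or report them is up to you; skipping is just the simplest choice for a sketch.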
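Answer 1 (score: 3)
The best way I know of to parse an ASCII text file is to read it line by line and use strtok. It is a C function, but it will break your input into individual tokens. You can then use the string parsing functions atoi (or strtol) and strtod to parse the numeric values. For the file format you specified, I would do something like this: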
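// Requires <cstdlib>, <cstring>, <fstream>, <iostream>, <set> and <string>;
// assumes "using namespace std" and that this runs inside main().
string line;
ifstream f(argv[1]);
if (!f.is_open()) {
    cout << "The file you specified could not be read." << endl;
    return 1;
}
// Loop on getline itself: testing eof() first would process a stale line
// once more after the final read fails.
while (getline(f, line)) {
    // Skip empty lines and '#' comment lines.
    if (line.empty() || line[0] == '#') continue;
    // strtok modifies its input, so work on a writable copy.
    char *buf = new char[line.size() + 1];
    strcpy(buf, line.c_str());
    Peptide pep;
    // Assumes at least five fields per line; strtok returns NULL when tokens run out.
    pep.mass = strtod(strtok(buf, " "), NULL);
    pep.sequence = strtok(NULL, " ");
    pep.numK = strtol(strtok(NULL, " "), NULL, 10);
    pep.numPTS = strtol(strtok(NULL, " "), NULL, 10);
    pep.numM = strtol(strtok(NULL, " "), NULL, 10);
    // All remaining tokens are parents.
    char *ptr;
    while ((ptr = strtok(NULL, " ")) != NULL)
        pep.parents.insert(strtol(ptr, NULL, 10));
    cout << "mass: " << pep.mass << endl
         << "sequence: " << pep.sequence << endl
         << "numK: " << pep.numK << endl
         << "numPTS: " << pep.numPTS << endl
         << "numM: " << pep.numM << endl
         << "parents:" << endl;
    for (set<int>::iterator it = pep.parents.begin(); it != pep.parents.end(); ++it)
        cout << "\t- " << *it << endl;
    delete[] buf;  // Release the per-line copy.
}
f.close();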
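One caveat with the strtok approach is that strtok returns a null pointer once the tokens run out, so a short line would hand a null pointer to strtod or strtol. Below is a rough sketch of the same idea with that case checked. The name parsePeptideStrtok is hypothetical, Peptide is redeclared as a plain struct so the snippet stands alone, and a std::vector<char> replaces the new[] buffer so nothing has to be freed by hand:

```cpp
#include <cstdlib>
#include <cstring>
#include <set>
#include <string>
#include <vector>

// Stand-in for the Peptide class from the question.
struct Peptide {
    double mass = 0.0;
    std::string sequence;
    int numK = 0;
    int numPTS = 0;
    int numM = 0;
    std::set<int> parents;
};

// Hypothetical helper: tokenizes one line with strtok.
// Returns false if the line has fewer than five fields.
bool parsePeptideStrtok(const std::string& line, Peptide& out) {
    // strtok modifies its input, so work on a writable, NUL-terminated copy.
    std::vector<char> buf(line.begin(), line.end());
    buf.push_back('\0');

    const char* delims = " \t";
    char* fields[5];
    char* tok = std::strtok(buf.data(), delims);
    for (int i = 0; i < 5; ++i) {
        if (tok == nullptr) return false;  // ran out of tokens early
        fields[i] = tok;
        tok = std::strtok(nullptr, delims);
    }

    Peptide p;
    p.mass = std::strtod(fields[0], nullptr);
    p.sequence = fields[1];
    p.numK = static_cast<int>(std::strtol(fields[2], nullptr, 10));
    p.numPTS = static_cast<int>(std::strtol(fields[3], nullptr, 10));
    p.numM = static_cast<int>(std::strtol(fields[4], nullptr, 10));

    // After the loop, tok already holds the sixth token (or nullptr).
    for (; tok != nullptr; tok = std::strtok(nullptr, delims))
        p.parents.insert(static_cast<int>(std::strtol(tok, nullptr, 10)));

    out = p;
    return true;
}
```

Note that this only checks the number of fields, not their contents: strtol and strtod return 0 for text that is not a number unless you inspect their end-pointer argument. Also keep in mind that strtok keeps internal state between calls, so it is not safe to use from several threads at once.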