Question

Suppose I want to process a text file char by char before using it. Which method is the most efficient?

I can do this:

ifstream ifs("the_file.txt", ios_base::in);
char c;
while (ifs >> noskipws >> c) {
    // process c ...
}
ifs.close();

or this:

ifstream ifs("the_file.txt", ios_base::in);
stringstream sstr;
sstr << ifs.rdbuf();
string txt = sstr.str();
for (string::iterator iter = txt.begin(); iter != txt.end(); ++iter) {
    // process *iter ...
}

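The final output will be split according to the chars found while iterating.

Which is faster? Or is there perhaps another, more efficient way? Do I need to flush the stringstream for every character (I read somewhere that flush affects performance)?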

Answer 1

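a) Measure (my guess is that the first should be faster, since it avoids the extra allocation, but that is only a guess).

b) Although this may well be a very bad case of premature optimization, if you really need the best performance, try something along these lines: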

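// needs <fcntl.h> (open) and <unistd.h> (read, close)
int f = open(...);
// error handling here (open returns -1 on failure)
char buf[256];
while (1) {
  ssize_t rd = read(f, buf, sizeof buf);
  if (rd <= 0) break;  // 0 means end of file, -1 means a read error
  for (const char *p = buf; p < buf + rd; ++p) {
    // process *p; note that this loop can be entered more than once
  }
}
close(f);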

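I'm fairly sure this code is very hard to beat on performance (short of going into very low-level, non-standard I/O); however, it may well turn out that ifstream produces comparable results. Or it may not.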
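For comparison, the same read-a-block-then-scan pattern can be sketched with standard iostreams instead of POSIX calls. This is only an illustration: the function name `count_char_chunked`, the 4096-byte block size, and the choice of "count one character" as the per-char work are assumptions, not part of the answer.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: counts how often `target` occurs in the file,
// reading fixed-size blocks and scanning each block in memory.
std::size_t count_char_chunked(const std::string& path, char target) {
    std::ifstream ifs(path.c_str(), std::ios::binary);
    char buf[4096];  // block size is an arbitrary choice for this sketch
    std::size_t hits = 0;
    while (ifs) {
        ifs.read(buf, sizeof buf);
        // The last block is usually short; gcount() says how much arrived.
        const std::streamsize got = ifs.gcount();
        for (const char* p = buf; p < buf + got; ++p) {
            if (*p == target) ++hits;  // stand-in for "process *p"
        }
    }
    return hits;  // 0 if the file could not be opened
}
```

Calling `count_char_chunked("the_file.txt", '\n')` would count the newlines without ever holding the whole file in memory, which is the main difference from the approaches that build a `string` first.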

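Note: for C++, the difference this technique makes (reading into a fixed-size buffer, then scanning the buffer) is small and usually negligible, but in other languages it can easily make up to a 2x difference (this has been observed in Java).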
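Point (a), measuring, might be sketched roughly as follows. The two functions reproduce the two loops from the question, with "count the newlines" standing in for the real per-char processing; `time_ms` and both function names are hypothetical, and a serious benchmark would repeat each run several times and account for the OS file cache.

```cpp
#include <chrono>
#include <cstddef>
#include <fstream>
#include <sstream>
#include <string>

// First approach from the question: extract one char at a time.
std::size_t count_newlines_per_char(const std::string& path) {
    std::ifstream ifs(path.c_str(), std::ios_base::in);
    std::size_t n = 0;
    char c;
    while (ifs >> std::noskipws >> c) {
        if (c == '\n') ++n;  // stand-in for "process c"
    }
    return n;
}

// Second approach from the question: copy everything via rdbuf(), then iterate.
std::size_t count_newlines_slurp(const std::string& path) {
    std::ifstream ifs(path.c_str(), std::ios_base::in);
    std::stringstream sstr;
    sstr << ifs.rdbuf();
    const std::string txt = sstr.str();
    std::size_t n = 0;
    for (std::string::const_iterator it = txt.begin(); it != txt.end(); ++it) {
        if (*it == '\n') ++n;  // stand-in for "process *it"
    }
    return n;
}

// Hypothetical timing helper: runs f(path) once, stores its result,
// and returns the elapsed wall-clock time in milliseconds.
template <typename F>
double time_ms(F f, const std::string& path, std::size_t& result) {
    const std::chrono::steady_clock::time_point start =
        std::chrono::steady_clock::now();
    result = f(path);
    const std::chrono::steady_clock::time_point stop =
        std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Timing both functions on the same file, for example `time_ms(count_newlines_per_char, "the_file.txt", n)` against `time_ms(count_newlines_slurp, "the_file.txt", n)`, gives numbers for your own compiler, standard library, and disk rather than a guess. Returning and storing the count also keeps the compiler from optimizing the loop away.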

Answer 2

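Based on a rough test with a 20-megabyte file, the method below loads the file into a string in 0.1 seconds, while the rdbuf method from the question takes 0.5 seconds. So unless you are reading a lot of files, it makes essentially no difference.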

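ifstream ifs(filename, ios::binary);
string txt;
size_t cursor = 0;             // number of bytes stored so far
const size_t readsize = 4096;  // bytes requested per read
while (ifs.good())
{
    txt.resize(cursor + readsize);     // make room for the next block
    ifs.read(&txt[cursor], readsize);
    cursor += (size_t)ifs.gcount();    // the final read is usually short
}
txt.resize(cursor);                    // trim the unused tail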
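A common variation, which is not what was timed above, is to ask the stream for the file size first and read everything with a single call, so the string is allocated once instead of growing block by block. The sketch below assumes a regular file opened in binary mode, where `tellg()` at the end of the file reports the size in bytes; the name `slurp_presized` is hypothetical, and whether this is actually faster would need to be measured.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: loads the whole file using one allocation and one read.
// Assumption: for a regular file in binary mode, tellg() at the end is the size.
std::string slurp_presized(const std::string& path) {
    std::ifstream ifs(path.c_str(), std::ios::binary);
    if (!ifs) return std::string();  // could not open the file

    ifs.seekg(0, std::ios::end);
    const std::streamoff size = ifs.tellg();  // -1 if the position is unavailable
    if (size <= 0) return std::string();
    ifs.seekg(0, std::ios::beg);

    std::string txt(static_cast<std::size_t>(size), '\0');
    ifs.read(&txt[0], static_cast<std::streamsize>(size));
    // Keep only what was actually read, in case the read came up short.
    txt.resize(static_cast<std::size_t>(ifs.gcount()));
    return txt;
}
```

After `std::string txt = slurp_presized("the_file.txt");` the contents can be processed with the same iterator loop as in the question.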

Which is faster and more efficient: processing a file char by char, or as a stream?
