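Equivalent of a Python generator in C++ for buffered reads

Date: 2011-01-13 22:06:15

Tags: c++ python algorithm file io

In this article, Guido Van Rossum demonstrates the simplicity of Python and uses this function for buffered reads of a file of unknown length: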

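import array

def intsfromfile(f):
    while True:
        a = array.array('i')
        # Read the next 4000-byte chunk. The article used a.fromstring,
        # which Python 3 renamed to frombytes.
        a.frombytes(f.read(4000))
        if not a:
            break  # an empty array means end of file
        for x in a:
            yield x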

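For speed reasons, I need to do the same thing in C++! I have many files containing sorted lists of unsigned 64-bit integers that I need to merge. I found this nice piece of code for merging vectors.

I am stuck on how to make an ifstream for a file of unknown length behave like a vector that can be happily iterated over until the end of the file is reached. Any suggestions? Am I barking up the right tree with istreambuf_iterator?

1 answer:

Answer 0 (score: 7)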


#include <fstream>
#include <vector>
#include <iterator> // needed for istream_iterator

using namespace std;

int main(int argc, char** argv)
{
    ifstream infile("my-file.txt");

    // It isn't customary to declare these as standalone variables,
    // but see below for why it's necessary when initializing
    // containers.
    istream_iterator<int> infile_begin(infile);
    istream_iterator<int> infile_end;

    vector<int> my_ints(infile_begin, infile_end);

    // You can also do stuff with the istream_iterator objects directly:
    // Careful! If you run this program as is, this won't work because we
    // used up the input stream already with the vector.

    int total = 0;
    while (infile_begin != infile_end) {
        total += *infile_begin;
        ++infile_begin;  // advance to the next value in the stream
    }

    return 0;
}


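Note: In Effective STL, Scott Meyers explains why the separate variable declarations for the istream_iterator objects are needed above. Normally, you would write something like this: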

ifstream infile("my-file.txt");
vector<int> my_ints(istream_iterator<int>(infile), istream_iterator<int>());

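However, C++ actually parses the second line in an incredibly bizarre way. It treats it as the declaration of a function named my_ints that takes two parameters and returns a vector<int>. The first parameter is of type istream_iterator<int> and is named infile (the parentheses are ignored). The second parameter is an unnamed function pointer that takes zero arguments (because of the parentheses) and returns an object of type istream_iterator<int>.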




#include <fstream>
#include <vector>
#include <algorithm>
#include <iterator>

using namespace std;

int main(int argc, char** argv)
{
    // Open in binary mode so the raw bytes are read without translation.
    ifstream input("my-file.txt", ios::binary);
    istreambuf_iterator<char> input_begin(input);
    istreambuf_iterator<char> input_end;

    // Fill a char vector with input file's contents:
    vector<char> char_input(input_begin, input_end);

    // Convert it to an array of unsigned long with a cast.
    // This assumes unsigned long is 64 bits wide on your platform.
    unsigned long* converted = reinterpret_cast<unsigned long*>(&char_input[0]);
    size_t num_long_elements = char_input.size() * sizeof(char) / sizeof(unsigned long);

    // Put that information into a vector:
    vector<unsigned long> long_input(converted, converted + num_long_elements);

    return 0;
}
