Question

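I have a binary file in which I have saved the following variables millions of times:

A vector of floats of size x
Two unsigned integers

Currently I am using ifstream to open and read the file, but I wonder whether I could speed up execution by loading the whole file into memory and reducing I/O.

How can I load the file into memory and then convert it into the variables I want? With ifstream this is easy, but I don't know how to buffer the file and then extract the data.

This is the code I use to save the data: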

osfile.write(reinterpret_cast<const char*> (&sz), sizeof(int));// Size of vector
osfile.write(reinterpret_cast<const char*> (&vec[0]), sz*sizeof(float));
osfile.write(reinterpret_cast<const char*> (&a), sizeof(unsigned int));
osfile.write(reinterpret_cast<const char*> (&b), sizeof(unsigned int));

Answer 1

I think something is missing from your write procedure, because the size of the vector is missing from the written stream...

size_t size = vec.size();
osfile.write(reinterpret_cast<const char*> (&size), sizeof(size_t));
osfile.write(reinterpret_cast<const char*> (&vec[0]), vec.size()*sizeof(float));

osfile.write(reinterpret_cast<const char*> (&a), sizeof(unsigned int));
osfile.write(reinterpret_cast<const char*> (&b), sizeof(unsigned int));

Then you can load the whole file into a buffer in memory, as described in: Read whole ASCII file into C++ std::string
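As a rough illustration of that step, here is a minimal sketch of one common way to pull a whole file into a std::string. The helper name read_file_to_string is made up for this example, and this is not necessarily the technique the linked question settles on. The file is opened in binary mode so that no bytes are translated on the way in.

```cpp
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the original post): returns the raw bytes
// of the file at `path` as a std::string.
std::string read_file_to_string(const std::string& path) {
    // std::ios::binary prevents newline translation on platforms that do it.
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }
    std::ostringstream contents;
    contents << in.rdbuf();  // copy the entire stream buffer in one go
    return contents.str();
}
```

The returned string can hold embedded zero bytes, so it is usable as a container for binary data and can be handed to a std::istringstream afterwards.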

Then pass the loaded buffer to an istringstream iss; object.

Then read from the stream in the same way you wrote to it (the stream approach):

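// The data was written with write(), so it must be read back with read();
// operator>> would try to parse the bytes as text.
size_t size_of_vector = 0;
// read size of vector
iss.read(reinterpret_cast<char*>(&size_of_vector), sizeof(size_t));
// allocate once
vector<float> vec(size_of_vector);
// read content straight into the vector's storage
if (size_of_vector > 0)
    iss.read(reinterpret_cast<char*>(&vec[0]), size_of_vector * sizeof(float));
// at the end, read your pair of unsigned ints
unsigned int i1, i2;
iss.read(reinterpret_cast<char*>(&i1), sizeof(unsigned int));
iss.read(reinterpret_cast<char*>(&i2), sizeof(unsigned int));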

Edit: you still need to take binary versus character mode into account when opening and reading the stream...

Answer 2

Here is the approach I would suggest. First, read the whole file into a buffer:

std::ifstream binFile("your_binary_file", std::ifstream::binary);
if(binFile) {
    // get length of file
    binFile.seekg(0, binFile.end);
    size_t length = static_cast<size_t>(binFile.tellg());
    binFile.seekg(0, binFile.beg);

    // read whole contents of the file to a buffer at once
    char *buffer = new char[length];
    binFile.read(buffer, length);
    binFile.close();

    ...

Then extract the vector and the integers like this:

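    size_t offset = 0;
    // the vector size was written as an int
    int vectorSize = *reinterpret_cast<int*>(buffer + offset);
    offset += sizeof(int);

    float *vectorData = reinterpret_cast<float*>(buffer + offset);
    std::vector<float> floats(vectorSize);
    std::copy(vectorData, vectorData + vectorSize, floats.begin());
    offset += sizeof(float) * vectorSize;

    // the two trailing values were written as unsigned int
    unsigned int i1 = *reinterpret_cast<unsigned int*>(buffer + offset);
    offset += sizeof(unsigned int);
    unsigned int i2 = *reinterpret_cast<unsigned int*>(buffer + offset);
    offset += sizeof(unsigned int); // offset now points at the next record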

Finally, once all the data has been read, don't forget to free the memory allocated for the buffer:

    delete[] buffer;
}
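Since the file holds millions of such records back to back, the extraction step has to run in a loop until the buffer is exhausted. The following is only a sketch of how that could look, assuming the layout from the question (an int count, that many floats, then two unsigned ints) and the same machine for writing and reading. Record, read_value, and parse_records are names invented for this example. It uses std::memcpy instead of dereferencing a reinterpret_cast pointer, which sidesteps alignment and aliasing concerns, and it checks bounds so that a truncated file raises an error instead of reading past the end.

```cpp
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical record type mirroring what the question writes per entry.
struct Record {
    std::vector<float> values;
    unsigned int a = 0;
    unsigned int b = 0;
};

// Copies sizeof(T) bytes starting at `offset` into a T and advances `offset`.
template <typename T>
T read_value(const std::vector<char>& buf, std::size_t& offset) {
    if (buf.size() - offset < sizeof(T)) {
        throw std::runtime_error("truncated record");
    }
    T value;
    std::memcpy(&value, buf.data() + offset, sizeof(T));
    offset += sizeof(T);
    return value;
}

// Walks the buffer from start to end, decoding one record per iteration.
std::vector<Record> parse_records(const std::vector<char>& buf) {
    std::vector<Record> records;
    std::size_t offset = 0;
    while (offset < buf.size()) {
        Record rec;
        const int count = read_value<int>(buf, offset);
        // Reject negative counts and counts that exceed the remaining bytes.
        if (count < 0 ||
            static_cast<std::size_t>(count) > (buf.size() - offset) / sizeof(float)) {
            throw std::runtime_error("invalid vector size");
        }
        const std::size_t n = static_cast<std::size_t>(count);
        rec.values.resize(n);
        if (n > 0) {
            std::memcpy(rec.values.data(), buf.data() + offset, n * sizeof(float));
            offset += n * sizeof(float);
        }
        rec.a = read_value<unsigned int>(buf, offset);
        rec.b = read_value<unsigned int>(buf, offset);
        records.push_back(std::move(rec));
    }
    return records;
}
```

Holding the bytes in a std::vector<char> rather than a new[] buffer also removes the need for the manual delete[].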

C++: Load a binary file in memory and get objects
