Question

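The usage here is the same as in "Using read() directly into a C++ std:vector", but with reallocation required.

The size of the input file is unknown, so whenever the file turns out to be larger than the buffer, the buffer is reallocated at double its size. Here is my code: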

#include <vector>
#include <fstream>
#include <iostream>

int main()
{
    const size_t initSize = 1;
    std::vector<char> buf(initSize); // sizes buf to initSize, so &buf[0] below is valid
    std::ifstream ifile("D:\\Pictures\\input.jpg", std::ios_base::in|std::ios_base::binary);
    if (ifile)
    {
        size_t bufLen = 0;
        for (buf.reserve(1024); !ifile.eof(); buf.reserve(buf.capacity() << 1))
        {
            std::cout << buf.capacity() << std::endl;
            ifile.read(&buf[0] + bufLen, buf.capacity() - bufLen);
            bufLen += ifile.gcount();
        }
        std::ofstream ofile("rebuild.jpg", std::ios_base::out|std::ios_base::binary);
        if (ofile)
        {
            ofile.write(&buf[0], bufLen);
        }
    }
}

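The program prints the vector capacity as expected and writes an output file of the same size as the input, BUT only the bytes before offset initSize match the input; everything after that is zero...

Using &buf[bufLen] in read() is definitely undefined behavior, but &buf[0] + bufLen gets the right position to write to, because contiguous allocation is guaranteed, isn't it? (Provided initSize != 0. Note that std::vector<char> buf(initSize); sizes buf to initSize. And yes, if initSize == 0, I get a fatal runtime error in my environment.) What am I missing? Is this UB as well? Does the standard say anything about this usage of std::vector?

PS: Yes, I know we could compute the file size first and allocate a buffer of exactly that size, but in my project the input files can be expected to be almost always smaller than some SIZE, so I can set initSize to SIZE and expect no overhead (such as the file size calculation), using reallocation only as "exception handling". And yes, I know I could replace reserve() with resize() and capacity() with size() and get things working at little cost (zeroing the buffer on every resize), but I would still like to get rid of any redundant operation. Just a kind of paranoia...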

Update 1:

In fact, we can deduce from the standard that &buf[0] + bufLen yields the right position. Consider:

std::vector<char> buf(128);
buf.reserve(512);
char* bufPtr0 = &buf[0], *bufPtrOutofRange = &buf[0] + 200;
buf.resize(256); std::cout << "standard guarantees no reallocation" << std::endl;
char* bufPtr1 = &buf[0], *bufInRange = &buf[200]; 
if (bufPtr0 == bufPtr1)
    std::cout << "so bufPtr0 == bufPtr1" << std::endl;
std::cout << "and 200 < buf.size(), standard guarantees bufInRange == bufPtr1 + 200" << std::endl;
if (bufInRange == bufPtrOutofRange)
    std::cout << "finally we have: bufInRange == bufPtrOutofRange" << std::endl;

Output:

standard guarantees no reallocation
so bufPtr0 == bufPtr1
and 200 < buf.size(), standard guarantees bufInRange == bufPtr1 + 200
finally we have: bufInRange == bufPtrOutofRange

Here 200 can be replaced by any i with buf.size() <= i < buf.capacity(), and a similar deduction holds.

Update 2:

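Yes, I did miss something... But the problem is not contiguity (see Update 1), and it is not even a failure to write to memory. Today I had time to look into it: the program gets the right address and writes the right data into the reserved memory, but on the next reserve(), buf is reallocated and only the elements in the range [0, buf.size()) are copied to the new memory. So that is the answer to the whole riddle...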
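The effect can be observed without touching memory past size(). The following is only a sketch: the element type `Counted` and the helper `copiesDuringReserve` are hypothetical names introduced for illustration, and the sketch assumes (as is true of common implementations, though not promised by the standard) that a freshly constructed vector of n elements has a capacity smaller than the requested one. It counts how many elements a reallocating reserve() transfers to the new storage.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical element type that counts how often it is copy-constructed.
struct Counted
{
    static std::size_t copies;
    Counted() {}
    Counted(const Counted&) { ++copies; }
    Counted& operator=(const Counted&) { return *this; }
};
std::size_t Counted::copies = 0;

// Returns how many elements reserve(newCapacity) copied into new storage
// for a vector that holds `size` elements.
std::size_t copiesDuringReserve(std::size_t size, std::size_t newCapacity)
{
    std::vector<Counted> v(size);   // size() == size
    Counted::copies = 0;            // ignore copies made during construction
    v.reserve(newCapacity);         // reallocates only if capacity is too small
    return Counted::copies;         // one copy per element in [0, size())
}
```

On the usual implementations, copiesDuringReserve(3, 100) reports 3 even though room for 100 elements was requested: only the elements counted by size() are carried over, which is why bytes written past size() disappear on reallocation.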

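Final note: if no reallocation is needed after the buffer has been filled with some data, you can definitely use reserve()/capacity() instead of resize()/size(); but if reallocation is needed, use the latter.

Example: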

const size_t initSize = 32;
std::vector<char> buf(initSize);
buf.reserve(1024*100); // reserve enough space for file reading
std::ifstream ifile("D:\\Pictures\\input.jpg", std::ios_base::in|std::ios_base::binary);
if (ifile)
{
    ifile.read(&buf[0], buf.capacity());  // ok. the whole file is read into buf
    std::ofstream ofile("rebuld.jpg", std::ios_base::out|std::ios_base::binary);
    if (ofile)
    {
        ofile.write(&buf[0], ifile.gcount()); // rebuld.jpg just identical to input.jpg
    }
}
buf.reserve(1024*200); // horror! probably always lose all data in buf after offset initSize

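PS: I have not found any authoritative source (the standard, TC++PL, etc.) that explicitly agrees or disagrees with the suggestion I made above. But on all the implementations available here (VC++, g++, ICC), the example above works fine.

Here is another example, quoted from 'TC++PL, 4e', page 1041. Note that the first line of the function uses reserve() rather than resize():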

void fill(istream& in, string& s, int max)
// use s as target for low-level input (simplified)
{
    s.reserve(max); // make sure there is enough allocated space
    in.read(&s[0],max);
    const int n = in.gcount(); // number of characters read
    s.resize(n);
    s.shrink_to_fit();  // discard excess capacity
}

Answer 1

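reserve doesn't actually add the space to the vector; it only makes sure that you won't need a reallocation when you resize it. Instead of using reserve you should use resize, then do a final resize once you know how many bytes you actually read.

Edit: All that reserve is guaranteed to do is prevent iterator and pointer invalidation as you increase the size of the vector up to capacity(). It is not guaranteed to preserve the contents of those reserved bytes unless they are part of size().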

Using std::vector as a low-level buffer
