Question

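I have two large files, one of int and the other of float. I want to store them in a 2D vector.

What is the fastest way to read such data?

Note: the number of elements in each line is unique throughout the document.

What have I tried?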

std::string temp;
std::ifstream infile(Path);
int i = 0;
std::vector<std::vector<float>> data(100, std::vector<float>(1000));
while (std::getline(infile, temp))
{
    std::istringstream buffer(temp);
    int j = 0;
    while (!buffer.eof())
    {
        float temp2;
        buffer >> temp2;
        if (buffer.fail())
        {
            throw "Undefined variable in the input file!";
        }
        data.at(i).at(j) = temp2;
        j++;
    }
    i++;
}

This code is very slow!

Answer 1

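If the number of elements (and lines) is unique, you can't use pre-sized vectors and indexing. Not only will it break if there are more elements than you expected, but every element you don't overwrite will stay zero (or empty).

Instead, start with empty vectors and use push_back. To avoid reallocating the vectors, you can call reserve first.

Something like this: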

std::string line;
std::ifstream infile(Path);
std::vector<std::vector<float>> data;
data.reserve(100);  // Assuming no more than 100 lines

while (std::getline(infile, line))
{
    data.emplace_back();
    std::vector<float>& row = data.back();
    row.reserve(1000); // Assuming 1000 elements will do
    std::istringstream buffer(line);
    float element = 0;
    while (buffer >> element)
    {
        row.push_back(element);
    }
}

If you want to read as fast as possible, don't store the data in a text format.
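
To make that last remark concrete, here is a minimal sketch of one possible binary layout. It is not part of the answer above: the names `write_rows_binary` and `read_rows_binary` and the layout itself (per row, a 64-bit element count followed by that many raw floats) are assumptions invented for this example. Such a file is only readable on a machine with the same endianness and float representation as the one that wrote it.

```cpp
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

using Rows = std::vector<std::vector<float>>;

// Hypothetical layout (an assumption for this sketch): for each row,
// a 64-bit element count followed by that many raw floats.
void write_rows_binary(const std::string& path, const Rows& data)
{
    std::ofstream out(path, std::ios::binary);
    if (!out) throw std::runtime_error("cannot open " + path);
    for (const auto& row : data)
    {
        const std::uint64_t count = row.size();
        out.write(reinterpret_cast<const char*>(&count), sizeof count);
        out.write(reinterpret_cast<const char*>(row.data()),
                  static_cast<std::streamsize>(count * sizeof(float)));
    }
}

Rows read_rows_binary(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + path);
    Rows data;
    std::uint64_t count = 0;
    // Each iteration reads one row header; the loop ends at end of file.
    while (in.read(reinterpret_cast<char*>(&count), sizeof count))
    {
        // Note: a corrupted count would trigger a huge allocation here.
        std::vector<float> row(count);
        // One bulk read per row, no text parsing at all.
        in.read(reinterpret_cast<char*>(row.data()),
                static_cast<std::streamsize>(count * sizeof(float)));
        if (!in) throw std::runtime_error("truncated row in " + path);
        data.push_back(std::move(row));
    }
    return data;
}
```

The idea would be to convert the text file once and load the binary version on later runs; whether that is worth it depends on how often the file is read.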

Answer 2

A few tips:

Disable stdio synchronization by adding:
```
std::ios::sync_with_stdio(false);
```

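at the top of your code. Note that this setting concerns the standard streams (std::cin, std::cout and so on), so a gain for a std::ifstream is not guaranteed.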

Reuse your std::istringstream by putting:
```
std::istringstream buffer(temp);
```

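outside your loop, and reset it after each use with buffer.clear(); (clear() only resets the error flags, so also call buffer.str(temp); to load the next line).

Instead of: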
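
A minimal sketch of the reuse tip, assuming whitespace-separated floats with one row per line. The helper name `parse_lines` is made up for this example, and it takes any `std::istream` so it can be fed a file or a string. How much time this saves depends on the standard library implementation.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: parses whitespace-separated floats, one row per line.
std::vector<std::vector<float>> parse_lines(std::istream& in)
{
    std::vector<std::vector<float>> data;
    std::string line;
    std::istringstream buffer;  // constructed once, reused for every line

    while (std::getline(in, line))
    {
        buffer.clear();    // reset the eofbit/failbit left by the previous line
        buffer.str(line);  // replace the stream's contents with the new line

        data.emplace_back();
        float value = 0;
        while (buffer >> value)
        {
            data.back().push_back(value);
        }
    }
    return data;
}
```

With a file this would be called as `parse_lines(infile)`, where `infile` is the `std::ifstream` from the question.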
```
data.at(i).at(j) = temp2;
```

use:

data[i][j] = temp2;

This version does no bounds checking, so it is slightly faster.
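
The trade-off can be sketched as follows. Both helper names (`at_rejects`, `store_unchecked`) are invented for this illustration. `at()` throws `std::out_of_range` for a bad index, while `operator[]` with a bad index is undefined behavior, so the unchecked form is only safe when the indices are known to be valid.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true if v.at(index) rejects the index by throwing std::out_of_range.
bool at_rejects(const std::vector<float>& v, std::size_t index)
{
    try
    {
        (void)v.at(index);  // checked access
        return false;
    }
    catch (const std::out_of_range&)
    {
        return true;
    }
}

// Unchecked store: the caller must guarantee that i and j are in range,
// because operator[] performs no bounds check.
void store_unchecked(std::vector<std::vector<float>>& data,
                     std::size_t i, std::size_t j, float value)
{
    data[i][j] = value;
}
```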

Efficiently reading a file into a 2D array