Question

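I am currently writing inference code with the TensorFlow C++ API on a GPU machine, using a trained TensorFlow graph.

Here is my setup:

Platform: CentOS 7
TensorFlow version: TensorFlow 1.5
CUDA version: CUDA 9.0
C++ version: C++11

I am struggling with a couple of issues.

1) First, I followed this tutorial to learn the basic template for loading a graph in C++. The example in the tutorial is very simple, yet when I run the program on a GPU machine it takes almost 0.9G of RAM.

2) My graph is much more complex than the one in the tutorial. It has about 20 layers, with between 300 and 5000 nodes per layer.

My (pseudo)code snippet is below. For simplicity, I kept only the code that contributes to the (potential) memory problem: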

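// Feed type expected by Session::Run().
using TensorDict = std::vector<std::pair<std::string, tensorflow::Tensor>>;

tensorflow::Tensor input = getDataFromSomewhere(...);
// Number of rows along the first (batch) dimension of the input.
const int length = static_cast<int>(input.dim_size(0));
const int g_batch_size = 50;

// 1) Create session...
// 2) Load graph...

// 3) Inference
for (int x = 0; x < length; x += g_batch_size) {

    tensorflow::Tensor out;
    // Slice() takes rows [x, end) along the first dimension and shares
    // the underlying buffer with input, so no data is copied here.
    auto cur_slice = input.Slice(x, std::min(x + g_batch_size, length));

    inference(cur_slice, out);

    // doSomethingWithOutput(out);
}

// 4) Close session and free session memory


// Inference helper function
tensorflow::Status inference(const tensorflow::Tensor& input_tensors, tensorflow::Tensor& out) {

    // Memory usage increases sharply from this point on
    TensorDict feed_dict = {{"IteratorGetNext:0", input_tensors}};
    std::vector<tensorflow::Tensor> outputs;

    tensorflow::Status status = session->Run(feed_dict, {"final_dense:0"}, {}, &outputs);
    if (!status.ok()) {
        return status;  // propagate the error instead of silently dropping it
    }

    // UpdateOutWithOutputs();

    return tensorflow::Status::OK();
}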

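After creating the session and loading the graph, memory usage is about 1.2G.

Then, as noted in the code, when the program reaches session->Run(...), memory usage rises above 2G.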
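
Figures like the 1.2G and 2G above can be attributed to individual steps by sampling the process's resident memory from inside the program. The following is a minimal sketch, assuming a Linux system (such as the CentOS 7 machine described) where /proc/self/status is available. The helper names parseStatusKb and residentKb are hypothetical and not part of TensorFlow, and the value reported is host RAM only, not GPU memory.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: scans text in /proc/<pid>/status format and returns
// the numeric value (in kB) of the given field, or -1 if it is not found.
long parseStatusKb(std::istream& in, const std::string& field) {
    const std::string prefix = field + ":";
    std::string line;
    while (std::getline(in, line)) {
        if (line.compare(0, prefix.size(), prefix) == 0) {
            std::istringstream rest(line.substr(prefix.size()));
            long kb = 0;
            if (rest >> kb) {
                return kb;
            }
            return -1;
        }
    }
    return -1;
}

// Hypothetical helper: resident set size (VmRSS) of the current process in kB.
// Returns -1 when /proc/self/status cannot be read (assumption: Linux only).
long residentKb() {
    std::ifstream status("/proc/self/status");
    if (!status) {
        return -1;
    }
    return parseStatusKb(status, "VmRSS");
}
```

Calling residentKb() before and after creating the session, loading the graph, and the first session->Run(...) and logging the differences would show which step accounts for each jump.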

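I am not sure whether this is normal behavior for TensorFlow. I have checked this and this thread, but I am still not sure whether my code creates redundant operations.

Any comments or suggestions are appreciated. Thanks in advance!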

Answer 1

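What I found is that the TensorFlow dynamic library takes about 200MB of memory, and the CUDA dynamic libraries take more than 500MB. So just loading these libraries already consumes a large amount of memory.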
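
One way to check per-library numbers like these on your own machine is to sum the resident pages of each mapped file listed in /proc/self/smaps. The following is a rough sketch, assuming Linux and the usual smaps layout (a mapping header line followed by "Key: value kB" lines). The function names rssByMapping and rssOfSelf are hypothetical, and Rss counts only pages currently resident in RAM, so the totals may differ from the sizes of the library files on disk.

```cpp
#include <fstream>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: sums the "Rss:" values (in kB) of every mapping in
// smaps-formatted text, grouped by the mapped file's path.
std::map<std::string, long> rssByMapping(std::istream& smaps) {
    std::map<std::string, long> totals;
    std::string line;
    std::string current = "[anonymous]";
    while (std::getline(smaps, line)) {
        std::istringstream fields(line);
        std::string first;
        fields >> first;
        if (first.empty()) {
            continue;
        }
        if (first.back() != ':') {
            // Mapping header: "start-end perms offset dev inode [pathname]"
            std::string perms, offset, dev, inode, path;
            fields >> perms >> offset >> dev >> inode;
            std::getline(fields >> std::ws, path);
            current = path.empty() ? "[anonymous]" : path;
        } else if (first == "Rss:") {
            long kb = 0;
            if (fields >> kb) {
                totals[current] += kb;
            }
        }
    }
    return totals;
}

// Hypothetical helper: the same totals for the current process.
// Returns an empty map if /proc/self/smaps cannot be opened.
std::map<std::string, long> rssOfSelf() {
    std::ifstream smaps("/proc/self/smaps");
    return rssByMapping(smaps);
}
```

Calling rssOfSelf() after the session has been created and printing the entries whose paths belong to the TensorFlow and CUDA shared libraries would show how much of the process's RAM they account for.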

High RAM cost when running TensorFlow in C++
