My environment:

CentOS 7
gcc (GCC) 6.3.1 20170216 (Red Hat 6.3.1-3)
TensorFlow 2.0 C++ API
24 CPUs and 1 Tesla T4

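For my work, I have to integrate a pre-trained model (a pb file) into my C++ program using the TensorFlow 2.0 C++ .so shared library. To improve performance, I tried running separate sessions on multiple threads.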

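The problem is that inference time increases in proportion to the number of child threads. With one child thread, a single inference takes about 100 ms; with 4 child threads, it takes about 400 ms. GPU utilization also stays at about 36%, even with multiple threads.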
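One way to read these numbers: 1 inference per 100 ms with one thread and 4 inferences per 400 ms with four threads both come to about 10 inferences per second, which would be consistent with the calls being serialized somewhere. The sketch below is not the TensorFlow code from this question. It is a standard-library-only model of that situation, and everything in it is an assumption made for illustration: `SerializedDevice`, `run_inference`, `BenchmarkResult`, and `benchmark` are hypothetical names, and the mutex plus fixed sleep only stand in for a device that handles one request at a time. It shows how per-call latency and aggregate throughput can be measured separately, so the same harness could be wrapped around a real inference call.

```cpp
#include <chrono>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for one inference call (for example a session run).
// ASSUMPTION: the device executes one request at a time. This is modeled by
// holding a mutex for a fixed service time; it is not how TensorFlow works
// internally, only a model that reproduces the reported symptom.
class SerializedDevice {
public:
    explicit SerializedDevice(std::chrono::milliseconds service_time)
        : service_time_(service_time) {}

    void run_inference() {
        std::lock_guard<std::mutex> lock(mutex_);
        std::this_thread::sleep_for(service_time_);
    }

private:
    std::mutex mutex_;
    std::chrono::milliseconds service_time_;
};

struct BenchmarkResult {
    double mean_latency_ms;   // average wall time of one call, as seen by its thread
    double throughput_per_s;  // completed calls per second across all threads
};

// Runs `calls_per_thread` calls on each of `num_threads` worker threads and
// reports mean per-call latency and overall throughput.
BenchmarkResult benchmark(SerializedDevice& device,
                          std::size_t num_threads,
                          std::size_t calls_per_thread) {
    using Clock = std::chrono::steady_clock;

    // One accumulator per thread, so workers never write to the same element.
    std::vector<double> latency_sum_ms(num_threads, 0.0);
    std::vector<std::thread> workers;

    const auto start = Clock::now();
    for (std::size_t t = 0; t < num_threads; ++t) {
        workers.emplace_back([&, t] {
            for (std::size_t i = 0; i < calls_per_thread; ++i) {
                const auto begin = Clock::now();
                device.run_inference();
                const std::chrono::duration<double, std::milli> elapsed =
                    Clock::now() - begin;
                latency_sum_ms[t] += elapsed.count();
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }
    const std::chrono::duration<double> total = Clock::now() - start;

    const double total_calls =
        static_cast<double>(num_threads * calls_per_thread);
    double latency_sum = 0.0;
    for (double s : latency_sum_ms) {
        latency_sum += s;
    }
    return {latency_sum / total_calls, total_calls / total.count()};
}
```

Calling `benchmark(device, 1, 5)` and then `benchmark(device, 4, 5)` on a `SerializedDevice` with a 20 ms service time should show the mean per-call latency rising with the thread count while throughput stays capped near 50 calls per second. How the waiting time is split between threads depends on how the mutex is scheduled. If a real measurement shows the same pattern, adding threads alone is unlikely to help until the serialization point is found.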

Does anyone have any ideas about this?

C++ TensorFlow: running sessions simultaneously on multiple threads

0 answers: