Question

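I am working on a TensorFlow prediction that takes about 15 minutes for each element in a list. I would therefore like to run the predictions on a multi-GPU setup with 8 GPUs, so that I can complete 8 such predictions in 15 minutes.

I heard that Python's concurrent.futures module can help with multiprocessing, so I tried passing an iterable (datatype: list) to a prediction function using ProcessPoolExecutor from concurrent.futures.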


import concurrent.futures
import os

import tensorflow as tf
from tensorflow.python.client import device_lib


def function(list_of_queries, GPU_node="0"):
    # Intended to restrict this worker process to a single GPU.
    os.environ["CUDA_VISIBLE_DEVICES"] = GPU_node
    print(os.getpid())
    print(os.environ["CUDA_VISIBLE_DEVICES"])

    # Allocate GPU memory on demand instead of reserving it all up front.
    core_config = tf.ConfigProto()
    core_config.gpu_options.allow_growth = True
    session = tf.Session(config=core_config)

    # Report which GPUs nvgpu and TensorFlow can see from this process.
    import nvgpu
    print(nvgpu.available_gpus(), "is the available gpu\n")
    print(nvgpu.gpu_info())

    if tf.test.gpu_device_name():
        print('Default GPU Device: {}\n'.format(tf.test.gpu_device_name()))

    local_device_protos = device_lib.list_local_devices()
    print([x.name for x in local_device_protos if x.device_type == 'GPU'])

    return [list_of_queries, GPU_node]


def get_available_gpus():
    # Returns TensorFlow device names such as '/device:GPU:0'.
    local_device_protos = device_lib.list_local_devices()
    return [x.name for x in local_device_protos if x.device_type == 'GPU']

gpus = get_available_gpus()

# One worker process per GPU; each call receives one query and one GPU name.
with concurrent.futures.ProcessPoolExecutor(max_workers=len(gpus)) as executor:
    results = [x for x in executor.map(function, list_of_queries, gpus)]
    print('results: ', results)

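I expected the tensorflow-gpu module to use all 8 GPUs when predicting the output, but instead it fully uses only one GPU, with a small memory footprint on the other 7. I get the following error: BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.

It would be helpful if you could tell me what is wrong with the approach above, and whether there is a way to assign one GPU to each worker and have the workers make predictions with the prediction function, predict_similar.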
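For reference, here is a minimal sketch of one way the "one GPU per worker" idea could be arranged. It is not a confirmed fix. It rests on two assumptions that the post does not verify: that initialising TensorFlow in the parent process (for example through device_lib.list_local_devices()) before the workers are created contributes to the crash, and that CUDA_VISIBLE_DEVICES has to hold a plain index such as "3" rather than a device name such as "/device:GPU:3". The names gpu_index, _init_worker, _predict and predict_on_all_gpus are hypothetical, and the call to predict_similar is left as a comment so that the sketch runs without TensorFlow installed.

```python
import multiprocessing as mp
import os
from concurrent.futures import ProcessPoolExecutor


def gpu_index(device_name):
    """Turn a TensorFlow device name such as '/device:GPU:3' into '3'."""
    return device_name.rsplit(":", 1)[-1]


def _init_worker(gpu_queue):
    # Runs once in every worker process: claim one GPU id from the queue
    # and expose only that GPU to this process.
    os.environ["CUDA_VISIBLE_DEVICES"] = gpu_queue.get()
    # Assumption: TensorFlow would be imported and the model loaded here,
    # after the variable is set, e.g.
    #   import tensorflow as tf
    #   ... build session / load model once per worker ...


def _predict(query):
    # Placeholder for the real work, e.g. predict_similar(query).
    # Returns the query together with the GPU id and pid that handled it.
    return query, os.environ["CUDA_VISIBLE_DEVICES"], os.getpid()


def predict_on_all_gpus(queries, gpu_ids):
    # "spawn" starts each worker as a fresh interpreter, so no CUDA state
    # is inherited from the parent process.
    ctx = mp.get_context("spawn")
    gpu_queue = ctx.Queue()
    for gpu_id in gpu_ids:
        gpu_queue.put(gpu_id)
    with ProcessPoolExecutor(
        max_workers=len(gpu_ids),
        mp_context=ctx,
        initializer=_init_worker,
        initargs=(gpu_queue,),
    ) as pool:
        # Results come back in the same order as the queries.
        return list(pool.map(_predict, queries))
```

Because the "spawn" start method re-imports the main module in every worker, a call such as predict_on_all_gpus(list_of_queries, [str(i) for i in range(8)]) would need to sit under an `if __name__ == "__main__":` guard, and the parent process should avoid touching the GPUs itself.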

Multi-GPU inference/prediction in TensorFlow

0 answers: