NotFoundError: libnccl.so.2 when using multi-GPU in TensorFlow

Asked: 2019-02-26 02:58:46

Tags: python tensorflow gpu nvidia

When I run the following code to train on multiple GPUs:

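import tensorflow as tf

# Mirror the model across all visible GPUs (TF 1.x contrib API).
distribution = tf.contrib.distribute.MirroredStrategy()
run_config = tf.estimator.RunConfig(train_distribute=distribution)

# model, input_fn, train_images, train_labels, EPOCHS and BATCH_SIZE
# are assumed to be defined earlier in the script.
estimator = tf.keras.estimator.model_to_estimator(model, config=run_config)

estimator.train(lambda: input_fn(train_images,
                                 train_labels,
                                 epochs=EPOCHS,
                                 batch_size=BATCH_SIZE))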

I get the following error:

tensorflow.python.framework.errors_impl.NotFoundError: libnccl.so.2: cannot open shared object file: No such file or directory
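
The message comes from the dynamic loader: the process could not open the shared library libnccl.so.2 (NVIDIA NCCL 2). A quick way to check this outside TensorFlow is the minimal sketch below. The helper name can_load is made up for illustration, and the check only tells you whether the loader can find and open the file. It does not tell you whether the installed NCCL version is the one this TensorFlow build expects.

```python
import ctypes
import os


def can_load(lib_name):
    """Try to open a shared library by name with the dynamic loader.

    Returns (True, None) on success, or (False, error_message) on failure.
    """
    try:
        ctypes.CDLL(lib_name)
        return True, None
    except OSError as exc:
        # Raised when the library is missing or not on the loader search path.
        return False, str(exc)


if __name__ == "__main__":
    ok, err = can_load("libnccl.so.2")
    print("libnccl.so.2 loadable:", ok)
    if not ok:
        print("loader error:", err)
        # On Linux the loader also searches the directories listed here.
        print("LD_LIBRARY_PATH =", os.environ.get("LD_LIBRARY_PATH", "<unset>"))
```

If the check fails, the likely cause is that NCCL is not installed on the machine, or that its directory is not on the loader search path.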

I have installed tensorflow-gpu and followed this guide:

https://medium.com/tensorflow/multi-gpu-training-with-estimators-tf-keras-and-tf-data-ba584c3134db

What am I missing here?

0 answers:

No answers yet