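Batch normalization does not work when I train a neural network on multiple GPUs. I am using tf.layers.batch_normalization(), but when I run the code I get the following error:
ValueError: Variable batch_normalization_4/gamma does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=tf.AUTO_REUSE in VarScope?
The trainable variables are shared without any problem, and my code runs fine without the normalization layer, but as soon as I add it I get this error. It looks like the gamma parameter of tf.layers.batch_normalization() is not created with tf.get_variable(). Can anyone help?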
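One possible reading of the `_4` suffix in the error, offered as an unverified guess: if the layer is called without a `name` argument, each call may pick a fresh automatic name (batch_normalization, batch_normalization_1, ...), so the second tower, built with reuse enabled, would look up a variable that the first tower never created. The sketch below is a toy pure-Python model of that naming behaviour, not TensorFlow code. `ToyVariableStore`, `batch_norm`, `build_tower`, the `bn_%d` names, and the choice of four layers are all hypothetical and only serve the illustration.

```python
# Toy model (NOT TensorFlow): an auto-named layer picks a fresh scope name on
# every call, while variables are looked up by their full name.


class ToyVariableStore:
    def __init__(self):
        self.variables = {}    # full variable name -> placeholder value
        self.name_counts = {}  # base layer name -> how many times it was used

    def unique_layer_name(self, base):
        """Return base, base_1, base_2, ... on successive calls."""
        count = self.name_counts.get(base, 0)
        self.name_counts[base] = count + 1
        return base if count == 0 else "%s_%d" % (base, count)

    def get_variable(self, full_name, reuse):
        """Create the variable when reuse is False, look it up when True."""
        if reuse:
            if full_name not in self.variables:
                raise ValueError("Variable %s does not exist" % full_name)
            return self.variables[full_name]
        if full_name in self.variables:
            raise ValueError("Variable %s already exists" % full_name)
        self.variables[full_name] = object()
        return self.variables[full_name]


def batch_norm(store, reuse, name=None):
    """Stand-in for a normalization layer that owns a single gamma variable."""
    if name is None:
        name = store.unique_layer_name("batch_normalization")
    return store.get_variable(name + "/gamma", reuse)


def build_tower(store, reuse, num_layers, explicit_names):
    """Build one tower; optionally give every layer a fixed name."""
    for i in range(num_layers):
        name = "bn_%d" % i if explicit_names else None
        batch_norm(store, reuse, name)


# Auto-named layers: the second tower asks for a name the first never made.
auto_store = ToyVariableStore()
build_tower(auto_store, reuse=False, num_layers=4, explicit_names=False)
try:
    build_tower(auto_store, reuse=True, num_layers=4, explicit_names=False)
except ValueError as err:
    print(err)  # Variable batch_normalization_4/gamma does not exist

# Explicitly named layers: both towers resolve to the same four variables.
named_store = ToyVariableStore()
build_tower(named_store, reuse=False, num_layers=4, explicit_names=True)
build_tower(named_store, reuse=True, num_layers=4, explicit_names=True)
print(sorted(named_store.variables))
```

If this guess is right, passing an explicit `name` (which tf.layers.batch_normalization() accepts) to each normalization layer would be the thing to try, but I have not confirmed that this is what happens in my code.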
You can find my code HERE. I have commented out the normalization, but you can uncomment it if needed.
The full error is:
(tensorflow_p36) ubuntu@ip-172-31-24-137:~/Boyuan/MultiGPU$ python multi_gpu.py --num_gpus=2
/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
from ._conv import register_converters as _register_converters
/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/matplotlib/__init__.py:962: UserWarning: Duplicate key in file "/home/ubuntu/.config/matplotlib/matplotlibrc", line #2
(fname, cnt))
/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/matplotlib/__init__.py:962: UserWarning: Duplicate key in file "/home/ubuntu/.config/matplotlib/matplotlibrc", line #3
(fname, cnt))
Reading /home/ubuntu/Boyuan/MNIST_Dataset/mnist_train.tfrecords
Traceback (most recent call last):
  File "multi_gpu.py", line 323, in <module>
    tf.app.run()
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 124, in run
    _sys.exit(main(argv))
  File "multi_gpu.py", line 320, in main
    train()
  File "multi_gpu.py", line 221, in train
    loss, correct = tower_loss(scope, images_batch, labels_batch)
  File "multi_gpu.py", line 92, in tower_loss
    logits = forward_propagation(images, layer_hidden_nums, True)
  File "multi_gpu.py", line 56, in forward_propagation
    Z_nor = tf.layers.batch_normalization(Z, training=training, momentum=0.9)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/layers/normalization.py", line 774, in batch_normalization
    return layer.apply(inputs, training=training)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 762, in apply
    return self.__call__(inputs, *args, **kwargs)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 636, in __call__
    self.build(input_shapes)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/layers/normalization.py", line 277, in build
    trainable=True)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 504, in add_variable
    partitioner=partitioner)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 1262, in get_variable
    constraint=constraint)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 1097, in get_variable
    constraint=constraint)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 435, in get_variable
    constraint=constraint)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 404, in _true_getter
    use_resource=use_resource, constraint=constraint)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 761, in _get_single_variable
    "reuse=tf.AUTO_REUSE in VarScope?" % name)
ValueError: Variable batch_normalization_4/gamma does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=tf.AUTO_REUSE in VarScope?
Thanks.