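I'm currently using MultiWorkerMirroredStrategy on tensorflow-1.14 to train a 3D convolutional network on AI Platform. Since AI Platform Training doesn't officially support 1.14 yet, I added it as a package dependency to get that strategy working.

The problem is that training fails when I use batch normalization, whereas it runs fine when I use group normalization (from the contrib package).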
    import tensorflow as tf

    def conv(inputs, is_training, filters=32, kernel_size=(3, 3, 3),
             strides=(1, 1, 1), padding='same', activation_fn='relu'):
        """ Function for 3D Convolution. """
        conv = tf.layers.conv3d(inputs, filters, kernel_size=kernel_size,
                                strides=strides, padding=padding)
        # Batch normalization; this is the layer that triggers the error.
        bn = tf.layers.batch_normalization(conv, training=is_training)
        # Any non-None activation_fn results in ReLU6 being applied.
        if activation_fn is None:
            out = bn
        else:
            out = tf.nn.relu6(bn)
        return out
Should this work fine on CMLE, or am I missing something?

Error:

    InvalidArgumentError: From /job:chief/replica:0/task:0: Lower bound check fail for input 4 from node gradients/F9/e-3/batch_normalization_24/batchnorm/add_1_grad/Sum_1 to node scoped_allocator_concat_424_31 input bounds = [0x7fcae8cce500, 0x7fcae8cce600] backing_tensor bounds = [0x7fcae8f5fb00, 0x7fcae8fe0f88] [[{{node scoped_allocator_concat_424_31}}]]
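
As far as I can tell from the two address ranges in the message, the buffer of the batch-normalization gradient starts below the backing tensor that the scoped allocator expects it to live in. Here is a minimal sketch in plain Python of that comparison. `parse_bounds` is a hypothetical helper of mine, and reading the check as "the input range must lie inside the backing tensor range" is my assumption from the wording of the message, not something I have confirmed in the TensorFlow source.

```python
import re

# The two address ranges, copied from the error message.
MESSAGE = ("input bounds = [0x7fcae8cce500, 0x7fcae8cce600] "
           "backing_tensor bounds = [0x7fcae8f5fb00, 0x7fcae8fe0f88]")


def parse_bounds(message, label):
    """Hypothetical helper: return (lower, upper) addresses for `label`."""
    pattern = re.escape(label) + r" bounds = \[(0x[0-9a-f]+), (0x[0-9a-f]+)\]"
    match = re.search(pattern, message)
    if match is None:
        raise ValueError("no bounds found for %r" % label)
    return int(match.group(1), 16), int(match.group(2), 16)


input_lo, input_hi = parse_bounds(MESSAGE, "input")
backing_lo, backing_hi = parse_bounds(MESSAGE, "backing_tensor")

# Assumed meaning of the check: the input must sit inside the backing tensor.
inside = backing_lo <= input_lo and input_hi <= backing_hi

print("input size (bytes):  ", input_hi - input_lo)
print("backing size (bytes):", backing_hi - backing_lo)
print("input inside backing:", inside)
print("input starts below backing:", input_lo < backing_lo)
```

If that reading is right, the gradient tensor was allocated outside the shared buffer rather than being slightly misaligned within it.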