Question

I need to get the gradients of the weights and biases with tf.gradients():

        x = tf.placeholder(tf.float32, [batch_size, x_train.shape[1]])
        y = tf.placeholder(tf.float32, [batch_size, y_train.shape[1]])
        y_ = tf.placeholder(tf.float32, [batch_size, y_train.shape[1]])

        Wx=tf.Variable(tf.random_normal(stddev=0.1,shape=[x_train.shape[1],n_hidden]))
        Wy=tf.Variable(tf.random_normal(stddev=0.1,shape=[y_train.shape[1],n_hidden]))
        b=tf.Variable(tf.constant(0.1,shape=[n_hidden]))

        hidden_joint=tf.nn.relu((tf.matmul(x,Wx)+tf.matmul(y,Wy))+b)
        hidden_marg=tf.nn.relu(tf.matmul(x,Wx)+tf.matmul(y_,Wy)+b)

        Wout=tf.Variable(tf.random_normal(stddev=0.1,shape=[n_hidden, 1]))
        bout=tf.Variable(tf.constant(0.1,shape=[1]))

        out_joint=tf.matmul(hidden_joint,Wout)+bout
        out_marg=tf.matmul(hidden_marg,Wout)+bout

        optimizer = tf.train.AdamOptimizer(0.005)


        t = out_joint
        et = tf.exp(out_marg)

        ex_delta_t = tf.reduce_mean(tf.gradients(t, tf.trainable_variables()))
        ex_delta_et = tf.reduce_mean(tf.gradients(et, tf.trainable_variables()))

But I always get the following error:

  File "/home/ferdi/Documents/mine/mine.py", line 77, in get_mi_batched
    ex_delta_t = tf.reduce_mean(tf.gradients(t, tf.trainable_variables()))
  File "/home/ferdi/anaconda3/envs/ml_all/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/home/ferdi/anaconda3/envs/ml_all/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 1490, in reduce_mean
    reduction_indices),
  File "/home/ferdi/anaconda3/envs/ml_all/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 1272, in _ReductionDims
    return range(0, array_ops.rank(x))
  File "/home/ferdi/anaconda3/envs/ml_all/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 368, in rank
    return rank_internal(input, name, optimize=True)
  File "/home/ferdi/anaconda3/envs/ml_all/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 388, in rank_internal
    input_tensor = ops.convert_to_tensor(input)
  File "/home/ferdi/anaconda3/envs/ml_all/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1048, in convert_to_tensor
    as_ref=False)
  File "/home/ferdi/anaconda3/envs/ml_all/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1144, in internal_convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/home/ferdi/anaconda3/envs/ml_all/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 971, in _autopacking_conversion_function
    return _autopacking_helper(v, dtype, name or "packed")
  File "/home/ferdi/anaconda3/envs/ml_all/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 923, in _autopacking_helper
    return gen_array_ops.pack(elems_as_tensors, name=scope)
  File "/home/ferdi/anaconda3/envs/ml_all/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 4689, in pack
    "Pack", values=values, axis=axis, name=name)
  File "/home/ferdi/anaconda3/envs/ml_all/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/ferdi/anaconda3/envs/ml_all/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/home/ferdi/anaconda3/envs/ml_all/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3272, in create_op
    op_def=op_def)
  File "/home/ferdi/anaconda3/envs/ml_all/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1790, in __init__
    control_input_ops)
  File "/home/ferdi/anaconda3/envs/ml_all/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1629, in _create_c_op
    raise ValueError(str(e))
ValueError: Shapes must be equal rank, but are 2 and 1
    From merging shape 3 with other shapes. for 'Rank/packed' (op: 'Pack') with input shapes: [512,20], [10,20], [20], [20,1], [1].

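If I reshape the tensors or try something similar, other errors occur. I know there are many similar questions, but I still can't figure it out. What am I doing wrong?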

Answer 1

Solution:

ex_delta_t = tf.reduce_mean( tf.concat([tf.reshape(g, [-1]) for g in tf.gradients(t, tf.trainable_variables())], axis=0))
ex_delta_et = tf.reduce_mean( tf.concat([tf.reshape(g, [-1]) for g in tf.gradients(et, tf.trainable_variables())], axis=0))

Or the same code, expanded:

grads_t_0 = tf.gradients(t, tf.trainable_variables())
grads_et_0 = tf.gradients(et, tf.trainable_variables())

grads_t = []
grads_et = []
for gt,get in zip(grads_t_0, grads_et_0):
    grads_t.append(tf.reshape(gt, [-1]))
    grads_et.append(tf.reshape(get, [-1]))

grads_t_flatten = tf.concat(grads_t, axis=0)
grads_et_flatten = tf.concat(grads_et, axis=0)

ex_delta_t = tf.reduce_mean(grads_t_flatten)
ex_delta_et = tf.reduce_mean(grads_et_flatten)

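Explanation:

You are most likely getting this error because your gradient calls

tf.gradients(t, tf.trainable_variables())
tf.gradients(et, tf.trainable_variables())

each return a list of tensors with different shapes (here [512,20], [10,20], [20], [20,1] and [1]). tf.reduce_mean() first tries to pack that list into a single tensor, and that fails because the shapes do not all have the same rank.

One way to fix this is to flatten each gradient and concatenate the flattened gradients into a single 1-D tensor before passing it to reduce_mean.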
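To see what the flatten-and-concatenate step does independently of TensorFlow, here is a minimal plain-Python sketch in which nested lists stand in for gradient tensors. The helper names `flatten` and `mean_over_all_elements` are hypothetical and exist only for this illustration; they are not TensorFlow functions.

```python
# Plain-Python analogy: nested lists play the role of gradient tensors.

def flatten(nested):
    """Recursively flatten a nested list of numbers (analogous to tf.reshape(g, [-1]))."""
    if not isinstance(nested, list):
        return [nested]
    flat = []
    for item in nested:
        flat.extend(flatten(item))
    return flat


def mean_over_all_elements(tensors):
    """Flatten each 'tensor', join the pieces end to end, and average every element."""
    combined = []
    for t in tensors:
        combined.extend(flatten(t))  # analogous to tf.concat(..., axis=0)
    return sum(combined) / len(combined)


grad_w = [[1.0, 2.0], [3.0, 4.0]]  # rank 2, shape [2, 2]
grad_b = [5.0, 6.0]                # rank 1, shape [2]

result = mean_over_all_elements([grad_w, grad_b])
print(result)  # 3.5
```

Note that this is a mean over all individual elements, so large weight matrices contribute more than small bias vectors. Averaging the per-variable means instead would give a different number (4.0 in this toy case), so it is worth checking which of the two you actually want.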

Let's look at a simple example that reproduces the error and shows the fix!

# Dummy gradients standing in for the output of tf.gradients()
grad_wx = tf.constant(0.1, shape=[512, 20])
grad_wy = tf.constant(0.2, shape=[10, 20])
grad_b = tf.constant(0.3, shape=[20])
grad_wout = tf.constant(0.4, shape=[20, 1])
grad_bout = tf.constant(0.5, shape=[1])

grads_0 = [grad_wx, grad_wy, grad_b, grad_wout, grad_bout]

sess = tf.Session()

result = tf.reduce_mean(grads_0)
print(sess.run(result))

Out (error):

ValueError: Shapes must be equal rank, but are 2 and 1
    From merging shape 3 with other shapes. for 'Rank/packed' (op: 'Pack') with input shapes: [512,20], [10,20], [20], [20,1], [1].

Solution:

result = tf.reduce_mean( tf.concat([tf.reshape(g, [-1]) for g in grads_0], axis=0))
print(sess.run(result))

Out (fixed):

0.102899365
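As a sanity check, the printed value can be reproduced by hand from the shapes and fill values of the dummy gradients, since every element of a given tensor is the same constant. The sketch below uses only the standard library; the variable names are chosen for this illustration.

```python
import math

# (shape, fill value) pairs copied from the dummy gradients in the example
dummy_grads = [
    ((512, 20), 0.1),
    ((10, 20), 0.2),
    ((20,), 0.3),
    ((20, 1), 0.4),
    ((1,), 0.5),
]

# Number of elements in the flattened, concatenated tensor
total_elements = sum(math.prod(shape) for shape, _ in dummy_grads)

# Each tensor contributes (element count * fill value) to the overall sum
weighted_sum = sum(math.prod(shape) * value for shape, value in dummy_grads)

expected_mean = weighted_sum / total_elements

print(total_elements)  # 10481
print(expected_mean)   # approximately 0.1029
```

The result agrees with the TensorFlow output to about five decimal places. The small remaining difference is consistent with float32 rounding in the TensorFlow computation.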
