I need to get the gradients with respect to the weights and biases via tf.gradients():
x = tf.placeholder(tf.float32, [batch_size, x_train.shape[1]])
y = tf.placeholder(tf.float32, [batch_size, y_train.shape[1]])
y_ = tf.placeholder(tf.float32, [batch_size, y_train.shape[1]])
Wx=tf.Variable(tf.random_normal(stddev=0.1,shape=[x_train.shape[1],n_hidden]))
Wy=tf.Variable(tf.random_normal(stddev=0.1,shape=[y_train.shape[1],n_hidden]))
b=tf.Variable(tf.constant(0.1,shape=[n_hidden]))
hidden_joint=tf.nn.relu((tf.matmul(x,Wx)+tf.matmul(y,Wy))+b)
hidden_marg=tf.nn.relu(tf.matmul(x,Wx)+tf.matmul(y_,Wy)+b)
Wout=tf.Variable(tf.random_normal(stddev=0.1,shape=[n_hidden, 1]))
bout=tf.Variable(tf.constant(0.1,shape=[1]))
out_joint=tf.matmul(hidden_joint,Wout)+bout
out_marg=tf.matmul(hidden_marg,Wout)+bout
optimizer = tf.train.AdamOptimizer(0.005)
t = out_joint
et = tf.exp(out_marg)
ex_delta_t = tf.reduce_mean(tf.gradients(t, tf.trainable_variables()))
ex_delta_et = tf.reduce_mean(tf.gradients(et, tf.trainable_variables()))
But I always get the following error:
File "/home/ferdi/Documents/mine/mine.py", line 77, in get_mi_batched
ex_delta_t = tf.reduce_mean(tf.gradients(t, tf.trainable_variables()))
File "/home/ferdi/anaconda3/envs/ml_all/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "/home/ferdi/anaconda3/envs/ml_all/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 1490, in reduce_mean
reduction_indices),
File "/home/ferdi/anaconda3/envs/ml_all/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 1272, in _ReductionDims
return range(0, array_ops.rank(x))
File "/home/ferdi/anaconda3/envs/ml_all/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 368, in rank
return rank_internal(input, name, optimize=True)
File "/home/ferdi/anaconda3/envs/ml_all/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 388, in rank_internal
input_tensor = ops.convert_to_tensor(input)
File "/home/ferdi/anaconda3/envs/ml_all/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1048, in convert_to_tensor
as_ref=False)
File "/home/ferdi/anaconda3/envs/ml_all/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1144, in internal_convert_to_tensor
ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
File "/home/ferdi/anaconda3/envs/ml_all/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 971, in _autopacking_conversion_function
return _autopacking_helper(v, dtype, name or "packed")
File "/home/ferdi/anaconda3/envs/ml_all/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 923, in _autopacking_helper
return gen_array_ops.pack(elems_as_tensors, name=scope)
File "/home/ferdi/anaconda3/envs/ml_all/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 4689, in pack
"Pack", values=values, axis=axis, name=name)
File "/home/ferdi/anaconda3/envs/ml_all/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/ferdi/anaconda3/envs/ml_all/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "/home/ferdi/anaconda3/envs/ml_all/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3272, in create_op
op_def=op_def)
File "/home/ferdi/anaconda3/envs/ml_all/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1790, in __init__
control_input_ops)
File "/home/ferdi/anaconda3/envs/ml_all/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1629, in _create_c_op
raise ValueError(str(e))
ValueError: Shapes must be equal rank, but are 2 and 1
From merging shape 3 with other shapes. for 'Rank/packed' (op: 'Pack') with input shapes: [512,20], [10,20], [20], [20,1], [1].
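If I reshape the tensors or try something similar, other errors occur. I know there are many similar questions, but I still can't figure it out. What am I doing wrong?
Answer 0 (score: 1)
Solution: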
ex_delta_t = tf.reduce_mean( tf.concat([tf.reshape(g, [-1]) for g in tf.gradients(t, tf.trainable_variables())], axis=0))
ex_delta_et = tf.reduce_mean( tf.concat([tf.reshape(g, [-1]) for g in tf.gradients(et, tf.trainable_variables())], axis=0))
Or the same code, unrolled:
grads_t_0 = tf.gradients(t, tf.trainable_variables())
grads_et_0 = tf.gradients(et, tf.trainable_variables())
grads_t = []
grads_et = []
# Flatten every gradient to 1-D so they can be concatenated
for g_t, g_et in zip(grads_t_0, grads_et_0):
    grads_t.append(tf.reshape(g_t, [-1]))
    grads_et.append(tf.reshape(g_et, [-1]))
grads_t_flatten = tf.concat(grads_t, axis=0)
grads_et_flatten = tf.concat(grads_et, axis=0)
ex_delta_t = tf.reduce_mean(grads_t_flatten)
ex_delta_et = tf.reduce_mean(grads_et_flatten)
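Explanation:
You get this error because the gradient calls
tf.gradients(t, tf.trainable_variables())
tf.gradients(et, tf.trainable_variables())
each return a list of tensors with different shapes, one per variable.
tf.reduce_mean() first tries to pack that list into a single tensor, and the packing fails because the shapes (and ranks) do not match.
One way to fix this is to flatten each gradient, concatenate the flattened gradients into a single 1-D tensor, and then pass that tensor to reduce_mean.
Here is a simple example that reproduces the error and shows the fix.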
# Dummy gradients standing in for the output of tf.gradients()
grad_wx = tf.constant(0.1, shape=[512, 20])
grad_wy = tf.constant(0.2, shape=[10, 20])
grad_b = tf.constant(0.3, shape=[20])
grad_wout = tf.constant(0.4, shape=[20, 1])
grad_bout = tf.constant(0.5, shape=[1])
grads_0 = [grad_wx, grad_wy, grad_b, grad_wout, grad_bout]
sess = tf.Session()
result = tf.reduce_mean(grads_0)
print(sess.run(result))
Out (error):
ValueError: Shapes must be equal rank, but are 2 and 1
From merging shape 3 with other shapes. for 'Rank/packed' (op: 'Pack') with input shapes: [512,20], [10,20], [20], [20,1], [1].
Solution:
result = tf.reduce_mean( tf.concat([tf.reshape(g, [-1]) for g in grads_0], axis=0))
print(sess.run(result))
Out (fixed):
0.102899365
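To see where that number comes from, the same flatten-and-concatenate step can be sketched in NumPy. This is only an illustration under assumptions: NumPy stands in for TensorFlow here, and the names grads, flat, mean_all and mean_of_means are made up for this sketch. It also shows that a mean over the concatenated vector weights each variable by its number of elements, so the [512, 20] gradient dominates the result; averaging the per-variable means instead would treat every variable equally.

```python
import numpy as np

# NumPy stand-ins with the same shapes and fill values as the dummy tensors above.
grads = [
    np.full((512, 20), 0.1, dtype=np.float32),
    np.full((10, 20), 0.2, dtype=np.float32),
    np.full((20,), 0.3, dtype=np.float32),
    np.full((20, 1), 0.4, dtype=np.float32),
    np.full((1,), 0.5, dtype=np.float32),
]

# Flatten each array to 1-D, then join them into a single vector.
flat = np.concatenate([g.reshape(-1) for g in grads], axis=0)

# Mean over all elements: arrays with more elements contribute more.
mean_all = flat.mean()

# Alternative: mean of the per-array means, so each variable counts once.
mean_of_means = np.mean([g.mean() for g in grads])

print(flat.shape)     # (10481,)
print(mean_all)       # roughly 0.1029
print(mean_of_means)  # roughly 0.3
```

Which of the two averages is appropriate depends on what ex_delta_t and ex_delta_et are meant to represent; the solution above computes the first one.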