I want to compute the gradients of a neural network's parameters. I wrote a simple toy program, but the gradient computation isn't working; it raises an error instead, and I don't understand why.
import tensorflow as tf
import numpy as np

# Xavier/Glorot-style bound for the uniform initializer
def var_init(n_input, n_output):
    return np.sqrt(6. / (n_input + n_output))

l1_init = var_init(1, 5)

W = tf.Variable(
    tf.random_uniform([1, 5], minval=-l1_init,
                      maxval=l1_init,
                      dtype=tf.float64), name='W')
b = tf.Variable(
    tf.random_uniform([5], minval=-l1_init, maxval=l1_init,
                      dtype=tf.float64), name='b')

# single scalar input 5.0, one layer with tanh activation
layer1 = tf.nn.tanh(tf.matmul(np.matrix([[5.]]), W) + b)

init = tf.global_variables_initializer()
gr = tf.gradients(layer1, [W, b])

sess = tf.Session()
sess.run(init)
sess.run(gr)
The error is:
tensorflow.python.framework.errors_impl.InternalError: Blas GEMM launch failed : a.shape=(1, 1), b.shape=(1, 5), m=1, n=5, k=1
[[Node: MatMul = MatMul[T=DT_DOUBLE, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/gpu:0"](MatMul/a, W/read)]]
Caused by op u'MatMul', defined at:
layer1 = tf.nn.tanh(tf.matmul(np.matrix([[5.]]), W) + b)
(I can't post the whole traceback because Stack Overflow complains that my question is mostly code.)
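
For reference, here is a minimal sketch of what I would expect to work, with the same graph pinned to the CPU (this assumes TensorFlow 1.x; the bound computation and variable names are just placeholders mirroring my code above). My thinking is that if this version runs, the GEMM failure is more likely a GPU/cuBLAS setup issue than a problem with tf.gradients itself:

import numpy as np
import tensorflow as tf

# Same graph as above, but pinned to the CPU and feeding the input
# through a tf.constant instead of an np.matrix. If this runs, the
# Blas GEMM failure is presumably coming from the GPU path.
with tf.device('/cpu:0'):
    bound = np.sqrt(6. / (1 + 5))
    x = tf.constant([[5.]], dtype=tf.float64)
    W = tf.Variable(
        tf.random_uniform([1, 5], -bound, bound, dtype=tf.float64), name='W')
    b = tf.Variable(
        tf.random_uniform([5], -bound, bound, dtype=tf.float64), name='b')
    layer1 = tf.nn.tanh(tf.matmul(x, W) + b)
    grads = tf.gradients(layer1, [W, b])

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(grads))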