Why is the gradient of softmax zero?
import numpy as np
import tensorflow as tf

xval = np.array([0.3, 0.4, 0.2, 0.2])
x = tf.placeholder(tf.float32, [4])
y = tf.math.sin(x)    # elementwise, so the gradient should be cos(x)
z = tf.nn.softmax(x)
g1 = tf.gradients(y, x)
g2 = tf.gradients(z, x)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    yvec, zvec, g1vec, g2vec = sess.run([y, z, g1, g2], feed_dict={x: xval})
This is what I get:
yvec  = [0.29552022 0.38941833 0.19866933 0.19866933]
zvec  = [0.2554379  0.28230256 0.2311298  0.2311298 ]
g1vec = [array([0.9553365, 0.921061 , 0.9800666, 0.9800666], dtype=float32)]  # which is equal to np.cos(xval)
g2vec = [array([0., 0., 0., 0.], dtype=float32)]
Why isn't the gradient the full matrix of partials dz_j / dx_i?
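For reference, my understanding is that the softmax Jacobian should be dz_j/dx_i = z_j * (delta_ij - z_i), which is clearly not all zeros. As a sanity check, here is a minimal sketch (assuming the same graph, x, z, and xval defined above) that builds what I expected, row by row, with one tf.gradients call per output component z[j]:

# Sketch: one tf.gradients call per output component z[j];
# each call yields one Jacobian row dz_j/dx_i.
jac_rows = [tf.gradients(z[j], x)[0] for j in range(4)]
jacobian = tf.stack(jac_rows)  # shape (4, 4)
with tf.Session() as sess:
    jac = sess.run(jacobian, feed_dict={x: xval})
print(jac)  # the 4x4 matrix of partials I expected from tf.gradients(z, x)

Each row of this matrix is nonzero, so I don't understand why tf.gradients(z, x) returns a single all-zero vector instead.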