I know that net.params[layer].diff
should hold the derivative of the loss with respect to the weights net.params[layer].data
. However, I am confused by the following example. It is a 3-layer (ip1, ip2, ip3) fully connected
net for MNIST:
import caffe
caffe.set_mode_cpu()
import numpy as np
solver = caffe.SGDSolver('solver.prototxt')
solver.net.copy_from('iter_18000.caffemodel')
solver.net.forward()
solver.net.backward()
# the computed derivatives of ip3
# shape of ip3: (10, 300)
computed = np.dot(np.transpose(solver.net.blobs['ip3'].diff),
                  solver.net.blobs['ip2'].data)
# actual derivatives of ip3
actual = solver.net.params['ip3'][0].diff
print np.count_nonzero(computed - actual)
The result is 2260. Can someone explain this? Many thanks.
Answer 0 (score: 1)
Solved! The computed derivatives of ip3
should be np.dot(np.transpose(solver.net.blobs['ip3'].diff), solver.net.blobs['relu2'].data)
. That is, in my example there is an extra ReLU layer between ip2 and ip3, so the bottom data of ip3 is the ReLU output (relu2), not the raw ip2 output.
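The rule the answer relies on can be checked without Caffe: for a fully connected layer y = x·Wᵀ, the weight gradient is dW = dyᵀ·x, where x must be the layer's actual bottom blob. If a ReLU sits between ip2 and ip3, that bottom is the ReLU output, not the pre-activation. A minimal NumPy sketch with toy shapes and random data (not the poster's network):

```python
import numpy as np

rng = np.random.RandomState(0)

# Toy fully connected layer: y = x @ W.T, with W of shape (out, in)
x_pre = rng.randn(4, 300)     # pre-activation output (like 'ip2'.data)
x = np.maximum(x_pre, 0.0)    # ReLU output -- the true bottom of the layer
W = rng.randn(10, 300)

y = x @ W.T                   # forward pass through the fully connected layer
dy = rng.randn(*y.shape)      # some upstream gradient (the layer's top diff)

# Weight gradient of a fully connected layer: dW = dy.T @ x
dW_correct = dy.T @ x         # uses the ReLU output (like 'relu2'.data)
dW_wrong = dy.T @ x_pre       # uses the pre-ReLU data (like 'ip2'.data)

# The two disagree wherever the ReLU clipped an activation to zero
print(np.count_nonzero(dW_correct - dW_wrong))
```

The mismatch count plays the same role as the 2260 in the question: every entry where the ReLU zeroed an input contributes a difference between the two gradient estimates.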