What is actually stored in Caffe's net.params[layer].diff?

Time: 2017-12-31 04:10:59

Tags: python deep-learning caffe pycaffe

I understand that net.params[layer].diff should hold the derivative of the loss function with respect to that layer's weights, net.params[layer].data. However, the following example confuses me. It is a 3-layer (ip1, ip2, ip3) fully connected net for MNIST:

import caffe
caffe.set_mode_cpu()
import numpy as np

solver = caffe.SGDSolver('solver.prototxt')
solver.net.copy_from('iter_18000.caffemodel')

solver.net.forward()
solver.net.backward()

# the computed derivatives of ip3
# shape of ip3: (10, 300)
computed = np.dot(np.transpose(solver.net.blobs['ip3'].diff),
                  solver.net.blobs['ip2'].data)
# actual derivatives of ip3
actual = solver.net.params['ip3'][0].diff
print(np.count_nonzero(computed - actual))

The result is 2260 nonzero entries. Can someone explain this? Many thanks.

1 Answer:

Answer 0 (score: 1):

Solved it! The computed derivatives of ip3 should be np.dot(np.transpose(solver.net.blobs['ip3'].diff), solver.net.blobs['relu2'].data). That is, in my example the ReLU is a separate (not in-place) layer, so the actual input to ip3 is the relu2 blob (the activated output), not the ip2 blob (the pre-activation).
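The fix above follows from the chain rule for a fully connected layer: if y = W·h, then dL/dW = (dL/dy)·hᵀ, where h is whatever actually feeds the layer — here the ReLU output, not the pre-activation. As a hedged illustration (plain NumPy, no Caffe; the toy loss L = ½‖y‖² and all array names are my own assumptions, not from the original post), the analytic gradient can be checked against finite differences:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup mirroring the post: z plays the role of the 'ip2' blob
# (pre-activation), h = relu(z) plays the role of the 'relu2' blob,
# and W is the weight matrix of the top layer ('ip3').
z = rng.normal(size=5)          # pre-activation of previous layer
h = np.maximum(z, 0.0)          # ReLU output: the real input to the FC layer
W = rng.normal(size=(3, 5))     # top-layer weights
y = W @ h

# Assume the loss is L = 0.5 * ||y||^2, so dL/dy = y.
dy = y

# Analytic weight gradient: dL/dW = dL/dy (outer) h.
# Note it uses the ReLU output h, not the pre-activation z.
dW = np.outer(dy, h)

# Numerical check by central finite differences.
eps = 1e-6
dW_num = np.zeros_like(W)
for i in range(W.shape[0]):
    for j in range(W.shape[1]):
        Wp = W.copy(); Wp[i, j] += eps
        Wm = W.copy(); Wm[i, j] -= eps
        Lp = 0.5 * np.sum((Wp @ h) ** 2)
        Lm = 0.5 * np.sum((Wm @ h) ** 2)
        dW_num[i, j] = (Lp - Lm) / (2 * eps)

assert np.allclose(dW, dW_num, atol=1e-5)
```

Using np.outer(dy, z) instead would reproduce the mismatch the question observed: the zeroed-out units of the ReLU make whole columns of the true gradient zero, which z does not capture.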