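I am trying to create two networks with the same weights and biases, and I expect them to produce similar learning curves. In iteration 2, all blobs in the two networks are identical (both data and diff), but the params (weights and biases) are different!
What am I doing wrong here?
Note: there is no shuffling anywhere in the network, the dataset, or the loss layer.
Thanks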
import caffe

# Defined before use so the script runs top to bottom.
def CopySolver(SolverA, SolverB):
    # Copy the weights and biases of every layer from SolverA's net into SolverB's net.
    params = SolverA.net.params.keys()
    paramsA = {pr: (SolverA.net.params[pr][0].data, SolverA.net.params[pr][1].data) for pr in params}
    paramsB = {pr: (SolverB.net.params[pr][0].data, SolverB.net.params[pr][1].data) for pr in params}
    for pr in params:
        paramsB[pr][1][...] = paramsA[pr][1]  # bias
        paramsB[pr][0][...] = paramsA[pr][0]  # weights

solver1 = caffe.SGDSolver('lenet_solver.prototxt')
solver2 = caffe.SGDSolver('lenet_solver.prototxt')
solver1.step(1)
solver2.step(1)
CopySolver(solver1, solver2)
for i in range(10):
    solver1.step(1)
    solver2.step(1)
    print(solver1.net.params['ip2'][1].diff)
    print(solver2.net.params['ip2'][1].diff)
Answer 0 (score: 1)
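You are not accounting for the solver's momentum. After you copy the net parameters from one solver object to the other, the momentum state that the solver (e.g. SGD) keeps internally still differs between solver1 and solver2, so the next update moves the two sets of parameters differently even though the gradients match. If you set "momentum: 0" in your "lenet_solver.prototxt", you should get the expected behavior.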
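To make the effect concrete, here is a minimal NumPy sketch of a momentum update on a toy problem. It is not Caffe's solver code: `sgd_momentum_step`, `grad_fn`, the toy loss, and all the numbers are made up for illustration. It only shows that two copies with identical weights and identical gradients still diverge when their momentum histories differ, and that they agree once momentum is 0.

```python
import numpy as np

def sgd_momentum_step(w, v, grad, lr=0.01, momentum=0.9):
    """One SGD-with-momentum update; returns the new weights and the new history."""
    v_new = momentum * v + lr * grad
    return w - v_new, v_new

def grad_fn(w):
    # Gradient of the toy loss 0.5 * ||w||^2 (an assumption for this sketch).
    return w

# Two "solvers" whose weights were just made identical by copying.
w_a = np.array([1.0, -2.0])
w_b = w_a.copy()

# Their momentum histories were NOT copied, so they still differ.
v_a = np.array([0.5, 0.5])
v_b = np.zeros(2)

# Same weights and same data give the same gradients (the "diff" blobs match).
same_grads = np.array_equal(grad_fn(w_a), grad_fn(w_b))

# With momentum, the updated weights differ after a single step.
w_a1, _ = sgd_momentum_step(w_a, v_a, grad_fn(w_a))
w_b1, _ = sgd_momentum_step(w_b, v_b, grad_fn(w_b))
diverged = not np.allclose(w_a1, w_b1)

# With momentum set to 0, the history is ignored and the weights stay in sync.
w_a0, _ = sgd_momentum_step(w_a, v_a, grad_fn(w_a), momentum=0.0)
w_b0, _ = sgd_momentum_step(w_b, v_b, grad_fn(w_b), momentum=0.0)
match_without_momentum = np.allclose(w_a0, w_b0)

print(same_grads, diverged, match_without_momentum)
```

This mirrors the symptom in the question: the blobs (data and diff) agree, yet the params differ after the update.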
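Alternatively, you can save the parameters, create two new solver objects, load the parameters into both, and restart training. That way you can be sure that neither solver starts with any leftover momentum. Here is an example of how that could look: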
import caffe

solver1 = caffe.SGDSolver('lenet_solver.prototxt')
solver2 = caffe.SGDSolver('lenet_solver.prototxt')
solver1.step(1)
solver2.step(1)

# Save solver1's current parameters to disk.
solver1.net.save("tmp.caffemodel")

# Fresh solver objects start without any momentum history.
solver1 = caffe.SGDSolver('lenet_solver.prototxt')
solver2 = caffe.SGDSolver('lenet_solver.prototxt')
solver1.net.copy_from("tmp.caffemodel")
solver2.net.copy_from("tmp.caffemodel")

for i in range(10):
    solver1.step(1)
    solver2.step(1)
    print(solver1.net.params['ip2'][1].diff)
    print(solver2.net.params['ip2'][1].diff)
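Printing one layer's diff by eye makes it easy to miss small differences. As a rough sketch, a helper like the one below could report the largest absolute parameter difference per layer. The name `max_param_difference` is hypothetical and not part of pycaffe; it only assumes that each argument maps a layer name to a sequence of objects exposing a NumPy `.data` array, which is how `solver.net.params` is laid out.

```python
import numpy as np

def max_param_difference(params_a, params_b):
    """Return {layer name: largest absolute difference over that layer's blobs}."""
    report = {}
    for name in params_a:
        report[name] = max(
            float(np.max(np.abs(blob_a.data - blob_b.data)))
            for blob_a, blob_b in zip(params_a[name], params_b[name])
        )
    return report
```

Calling it as `max_param_difference(solver1.net.params, solver2.net.params)` after each pair of steps should give all zeros for as long as the two solvers stay in sync.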
Answer 1 (score: 0)