I want to implement my project in two steps: 1. train the network with some data; 2. fine-tune the trained network with some other data.
In the first step (training the network) I got a decent result. However, in the second step (fine-tuning the network) a problem appears: the parameters do not update. More details below:
My loss consists of two parts: 1. the normal cost of my task; 2. an L2 regularization term. They are given as follows:
While fine-tuning the network, I print the loss, the cost, and the L2 term at every epoch. They are computed as:
c1 = y_conv - y_            # per-element error
c2 = tf.square(c1)
c3 = tf.reduce_sum(c2, 1)   # squared error per sample
c4 = tf.sqrt(c3)            # Euclidean distance per sample
cost = tf.reduce_mean(c4)
regular = 0.0001 * (tf.nn.l2_loss(w_conv1) + tf.nn.l2_loss(b_conv1) +
                    tf.nn.l2_loss(w_conv2) + tf.nn.l2_loss(b_conv2) +
                    tf.nn.l2_loss(w_conv3) + tf.nn.l2_loss(b_conv3) +
                    tf.nn.l2_loss(w_conv4) + tf.nn.l2_loss(b_conv4) +
                    tf.nn.l2_loss(w_fc1) + tf.nn.l2_loss(b_fc1) +
                    tf.nn.l2_loss(w_fc2) + tf.nn.l2_loss(b_fc2))
loss = regular + cost
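For reference, the two terms can be reproduced outside the graph to sanity-check the definition. Here is a minimal NumPy sketch (the toy arrays and the helper name `loss_terms` are my own, not from the original code); note that `tf.nn.l2_loss(w)` is `sum(w**2) / 2`:

```python
import numpy as np

def loss_terms(y_conv, y_true, weights, reg_scale=0.0001):
    # cost: per-sample Euclidean distance, averaged over the batch
    cost = np.mean(np.sqrt(np.sum((y_conv - y_true) ** 2, axis=1)))
    # mirror tf.nn.l2_loss(w) == sum(w ** 2) / 2 for every weight tensor
    regular = reg_scale * sum(np.sum(w ** 2) / 2 for w in weights)
    return cost, regular, cost + regular

# toy batch of two samples with two outputs each
y_conv = np.array([[1.0, 2.0], [3.0, 4.0]])
y_true = np.array([[1.0, 0.0], [0.0, 4.0]])
weights = [np.array([2.0, 2.0])]

cost, regular, loss = loss_terms(y_conv, y_true, weights)
print(cost, regular, loss)  # cost = 2.5, regular = 0.0004
```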
As the output below shows, the L2 term never updates, while the cost and loss do. To check whether the network parameters were updating, I computed these values:
Epoch: 1 || loss = 0.184248179 || cost = 0.181599200 || regular = 0.002648979
Epoch: 2 || loss = 0.184086733 || cost = 0.181437753 || regular = 0.002648979
Epoch: 3 || loss = 0.184602532 || cost = 0.181953552 || regular = 0.002648979
Epoch: 4 || loss = 0.184308948 || cost = 0.181659969 || regular = 0.002648979
Epoch: 5 || loss = 0.184251788 || cost = 0.181602808 || regular = 0.002648979
Epoch: 6 || loss = 0.184105504 || cost = 0.181456525 || regular = 0.002648979
Epoch: 7 || loss = 0.184241678 || cost = 0.181592699 || regular = 0.002648979
Epoch: 8 || loss = 0.184189570 || cost = 0.181540590 || regular = 0.002648979
Epoch: 9 || loss = 0.184390061 || cost = 0.181741081 || regular = 0.002648979
Epoch: 10 || loss = 0.184064055 || cost = 0.181415075 || regular = 0.002648979
Epoch: 11 || loss = 0.184323867 || cost = 0.181674888 || regular = 0.002648979
Epoch: 12 || loss = 0.184519534 || cost = 0.181870555 || regular = 0.002648979
Epoch: 13 || loss = 0.183869445 || cost = 0.181220466 || regular = 0.002648979
Epoch: 14 || loss = 0.184313927 || cost = 0.181664948 || regular = 0.002648979
Epoch: 15 || loss = 0.184198738 || cost = 0.181549759 || regular = 0.002648979
b_conv1 is one of the network parameters, and I confirmed that b_conv1 does not change between epochs. I am confused: why do the cost and loss update while the network parameters do not?
The whole code, apart from the CNN layers, is:
gs, lr, solver, l, c, r, pY, bconv1 = sess.run(
    [global_step, learning_rate, train, loss, cost, regular, y_conv, b_conv1],
    feed_dict={x: batch_X, y_: batch_Y, keep_prob: 0.5})
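A cheap way to confirm this kind of freeze is to snapshot a parameter before and after one training step and compare the copies. Here is a toy pure-Python sketch of the idea (plain SGD on a made-up bias vector, not the actual TF session; the names are my own):

```python
import numpy as np

def sgd_step(param, grad, lr):
    # one plain gradient-descent update
    return param - lr * grad

b = np.array([0.1, -0.2, 0.3])  # stand-in for b_conv1
grad = np.ones_like(b)

b_normal = sgd_step(b, grad, lr=1e-3)   # healthy learning rate
b_tiny = sgd_step(b, grad, lr=1e-24)    # collapsed learning rate

# with a collapsed lr the "update" is below any sensible tolerance,
# so the parameter looks frozen between epochs
print(np.allclose(b, b_normal))  # False: parameter moved
print(np.allclose(b, b_tiny))    # True: parameter looks frozen
```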
Any suggestion would help me a lot. Thanks in advance.
Answer by 张强 (score: 0):
Actually, I thought I had found the problem, but I had not; I only know what this bug leads to. The reason the parameters do not update is that after pre-training, global_step is very large, so the learning rate becomes extremely small (about 1e-24). So what I should do is set global_step back to 0 after restoring the network parameters. Moreover, the learning rate has to be set up again as well.
The code should look like this:
saver.restore(sess, '../TrainingData/convParameters.ckpt')
global_step = tf.Variable(0, trainable=False)
learning_rate = tf.train.exponential_decay(initial_learning_rate,
                                           global_step=global_step,
                                           decay_steps=int(X.shape[0] / 1000),
                                           decay_rate=0.99, staircase=True)
Then you can fetch the values of global_step and the learning rate to check whether they are correct:
gafter, lrafter = sess.run([global_step, learning_rate])
This must be done after restoring the network parameters.
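The collapse is just the staircase decay formula at work: with `staircase=True`, `tf.train.exponential_decay` computes `initial_lr * decay_rate ** (global_step // decay_steps)`. A small sketch (the step counts and initial rate are illustrative assumptions, not the poster's real numbers) shows why a restored global_step kills the learning rate:

```python
def staircase_decay(initial_lr, global_step, decay_steps, decay_rate=0.99):
    # staircase=True keeps the exponent an integer, so lr drops in steps
    return initial_lr * decay_rate ** (global_step // decay_steps)

# fresh run: the learning rate is still usable
lr_fresh = staircase_decay(1e-3, global_step=500, decay_steps=100)
# after restoring a long pre-training run, global_step is huge
lr_restored = staircase_decay(1e-3, global_step=480_000, decay_steps=100)

print(lr_fresh)     # still usable, ~9.5e-4
print(lr_restored)  # vanishingly small, on the order of 1e-24
```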
I thought the code above fixed the bug, but it did not: global_step still does not update during training.
What I tried was:

1. Resetting the optimizer, like this:

global_step = tf.Variable(0, trainable=False)
learning_rate = tf.train.exponential_decay(initial_learning_rate,
                                           global_step=global_step,
                                           decay_steps=int(X.shape[0] / 1000),
                                           decay_rate=0.99, staircase=True)
train = tf.train.AdamOptimizer(learning_rate).minimize(loss, global_step=global_step)
global_step_init = tf.initialize_variables([global_step])
sess.run(global_step_init)

But I was told that I was using uninitialized variables.
2. Initializing the optimizer as well:

global_step_init = tf.initialize_variables([global_step, train])

But I was told that train cannot be initialized (train is the operation returned by minimize(), not a variable, so it cannot be passed to tf.initialize_variables).
I got so tired of this that, in the end, I gave up. I simply made the learning rate a placeholder, as in: enter link description here
If anyone has a solution, please let me know. Many thanks.