I want to implement my project in two steps: 1. train the network with some data; 2. fine-tune the trained network with some other data.
In the first step (training the network) I got a decent result. However, in the second step (fine-tuning the network) a problem appears: the parameters do not update. More details below:
My loss consists of two parts: 1. the normal cost of my task; 2. an L2 regularization term. They are given as follows:
While fine-tuning the network, I print the loss, the cost, and the L2 term at every epoch. They are computed as:
c1 = y_conv - y_            # per-element error
c2 = tf.square(c1)
c3 = tf.reduce_sum(c2, 1)   # squared error per sample
c4 = tf.sqrt(c3)            # Euclidean distance per sample
cost = tf.reduce_mean(c4)
regular = 0.0001 * (tf.nn.l2_loss(w_conv1) + tf.nn.l2_loss(b_conv1) +
                    tf.nn.l2_loss(w_conv2) + tf.nn.l2_loss(b_conv2) +
                    tf.nn.l2_loss(w_conv3) + tf.nn.l2_loss(b_conv3) +
                    tf.nn.l2_loss(w_conv4) + tf.nn.l2_loss(b_conv4) +
                    tf.nn.l2_loss(w_fc1) + tf.nn.l2_loss(b_fc1) +
                    tf.nn.l2_loss(w_fc2) + tf.nn.l2_loss(b_fc2))
loss = regular + cost
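For reference, the two terms can be reproduced outside the graph to sanity-check the definition. Here is a minimal NumPy sketch (the toy arrays and the helper name `loss_terms` are my own, not from the original code); note that `tf.nn.l2_loss(w)` is `sum(w**2) / 2`:

```python
import numpy as np

def loss_terms(y_conv, y_true, weights, reg_scale=0.0001):
    # cost: per-sample Euclidean distance, averaged over the batch
    cost = np.mean(np.sqrt(np.sum((y_conv - y_true) ** 2, axis=1)))
    # mirror tf.nn.l2_loss(w) == sum(w ** 2) / 2 for every weight tensor
    regular = reg_scale * sum(np.sum(w ** 2) / 2 for w in weights)
    return cost, regular, cost + regular

# toy batch of two samples with two outputs each
y_conv = np.array([[1.0, 2.0], [3.0, 4.0]])
y_true = np.array([[1.0, 0.0], [0.0, 4.0]])
weights = [np.array([2.0, 2.0])]

cost, regular, loss = loss_terms(y_conv, y_true, weights)
print(cost, regular, loss)  # cost = 2.5, regular = 0.0004
```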
As the output below shows, the L2 term never updates, while the cost and loss do. To check whether the network parameters were updating, I computed these values:
Epoch: 1 || loss = 0.184248179 || cost = 0.181599200 || regular = 0.002648979
Epoch: 2 || loss = 0.184086733 || cost = 0.181437753 || regular = 0.002648979
Epoch: 3 || loss = 0.184602532 || cost = 0.181953552 || regular = 0.002648979
Epoch: 4 || loss = 0.184308948 || cost = 0.181659969 || regular = 0.002648979
Epoch: 5 || loss = 0.184251788 || cost = 0.181602808 || regular = 0.002648979
Epoch: 6 || loss = 0.184105504 || cost = 0.181456525 || regular = 0.002648979
Epoch: 7 || loss = 0.184241678 || cost = 0.181592699 || regular = 0.002648979
Epoch: 8 || loss = 0.184189570 || cost = 0.181540590 || regular = 0.002648979
Epoch: 9 || loss = 0.184390061 || cost = 0.181741081 || regular = 0.002648979
Epoch: 10 || loss = 0.184064055 || cost = 0.181415075 || regular = 0.002648979
Epoch: 11 || loss = 0.184323867 || cost = 0.181674888 || regular = 0.002648979
Epoch: 12 || loss = 0.184519534 || cost = 0.181870555 || regular = 0.002648979
Epoch: 13 || loss = 0.183869445 || cost = 0.181220466 || regular = 0.002648979
Epoch: 14 || loss = 0.184313927 || cost = 0.181664948 || regular = 0.002648979
Epoch: 15 || loss = 0.184198738 || cost = 0.181549759 || regular = 0.002648979
b_conv1 is one of the network parameters, and I confirmed that b_conv1 does not change between epochs. I am confused: why do the cost and loss update while the network parameters do not?
The whole code, apart from the CNN layers, is:
gs, lr, solver, l, c, r, pY, bconv1 = sess.run(
    [global_step, learning_rate, train, loss, cost, regular, y_conv, b_conv1],
    feed_dict={x: batch_X, y_: batch_Y, keep_prob: 0.5})
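A cheap way to confirm this kind of freeze is to snapshot a parameter before and after one training step and compare the copies. Here is a toy pure-Python sketch of the idea (plain SGD on a made-up bias vector, not the actual TF session; the names are my own):

```python
import numpy as np

def sgd_step(param, grad, lr):
    # one plain gradient-descent update
    return param - lr * grad

b = np.array([0.1, -0.2, 0.3])  # stand-in for b_conv1
grad = np.ones_like(b)

b_normal = sgd_step(b, grad, lr=1e-3)   # healthy learning rate
b_tiny = sgd_step(b, grad, lr=1e-24)    # collapsed learning rate

# with a collapsed lr the "update" is below any sensible tolerance,
# so the parameter looks frozen between epochs
print(np.allclose(b, b_normal))  # False: parameter moved
print(np.allclose(b, b_tiny))    # True: parameter looks frozen
```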
Any suggestion would help me a lot. Thanks in advance.
Answer by 张强 (score: 0):
Actually, I thought I had found the problem, but I had not; I only know what this bug leads to. The reason the parameters do not update is that after pre-training, global_step is very large, so the learning rate becomes extremely small (about 1e-24). So what I should do is set global_step back to 0 after restoring the network parameters. Moreover, the learning rate has to be set up again as well.
The code should look like this:
saver.restore(sess, '../TrainingData/convParameters.ckpt')
global_step = tf.Variable(0, trainable=False)
learning_rate = tf.train.exponential_decay(initial_learning_rate,
                                           global_step=global_step,
                                           decay_steps=int(X.shape[0] / 1000),
                                           decay_rate=0.99, staircase=True)
Then you can fetch the values of global_step and the learning rate to check whether they are correct:
gafter, lrafter = sess.run([global_step, learning_rate])
This must be done after restoring the network parameters.
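The collapse is just the staircase decay formula at work: with `staircase=True`, `tf.train.exponential_decay` computes `initial_lr * decay_rate ** (global_step // decay_steps)`. A small sketch (the step counts and initial rate are illustrative assumptions, not the poster's real numbers) shows why a restored global_step kills the learning rate:

```python
def staircase_decay(initial_lr, global_step, decay_steps, decay_rate=0.99):
    # staircase=True keeps the exponent an integer, so lr drops in steps
    return initial_lr * decay_rate ** (global_step // decay_steps)

# fresh run: the learning rate is still usable
lr_fresh = staircase_decay(1e-3, global_step=500, decay_steps=100)
# after restoring a long pre-training run, global_step is huge
lr_restored = staircase_decay(1e-3, global_step=480_000, decay_steps=100)

print(lr_fresh)     # still usable, ~9.5e-4
print(lr_restored)  # vanishingly small, on the order of 1e-24
```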
I thought the code above fixed the bug, but it did not: global_step still does not update during training.
What I tried was:

1. Resetting the optimizer, like this:

global_step = tf.Variable(0, trainable=False)
learning_rate = tf.train.exponential_decay(initial_learning_rate,
                                           global_step=global_step,
                                           decay_steps=int(X.shape[0] / 1000),
                                           decay_rate=0.99, staircase=True)
train = tf.train.AdamOptimizer(learning_rate).minimize(loss, global_step=global_step)
global_step_init = tf.initialize_variables([global_step])
sess.run(global_step_init)

But I was told that I was using uninitialized variables.
2. Initializing the optimizer as well:

global_step_init = tf.initialize_variables([global_step, train])

But I was told that train cannot be initialized (train is the operation returned by minimize(), not a variable, so it cannot be passed to tf.initialize_variables).
I got so tired of this that, in the end, I gave up. I simply made the learning rate a placeholder, as in: enter link description here
If anyone has a solution, please let me know. Many thanks.