Using apply_gradients together with an adaptive learning rate in TensorFlow's GradientDescentOptimizer

Asked: 2018-09-23 19:34:44

Tags: tensorflow gradient-descent

I have implemented a simple network, and I am interested in using an adaptive learning rate with the gradient descent optimizer while also manipulating the gradients.

I can access the gradients and apply a function to them, like this:

batch_size = 100

# Forward pass and loss.
y_pred = tf.nn.softmax(layer_fc2)
y_pred_cls = tf.argmax(y_pred, axis=1)
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=y_true,
                                                        logits=layer_fc2)
cost = tf.reduce_mean(cross_entropy)

# Global step counter, incremented by apply_gradients.
batch = tf.Variable(0, trainable=False)

optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-5)

params2 = [weights_conv2, weights_conv1]

# Compute the gradients, transform them, then apply them.
gradients = tf.gradients(cost, params2)
new_gradients = do_something(gradients)
train_op = optimizer.apply_gradients(zip(new_gradients, params2),
                                     global_step=batch)
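For reference, I understand the same wiring can also be expressed with the optimizer's own compute_gradients method, which returns (gradient, variable) pairs directly; the doubling here is just a stand-in for an arbitrary transformation:

# Equivalent sketch using compute_gradients, which yields
# (gradient, variable) pairs directly.
grads_and_vars = optimizer.compute_gradients(cost, var_list=params2)
scaled_grads_and_vars = [(g * 2, v) for g, v in grads_and_vars]
train_op = optimizer.apply_gradients(scaled_grads_and_vars,
                                     global_step=batch)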

Separately, I can use the optimizer with an adaptive learning rate, like this:

batch_size = 100

y_pred = tf.nn.softmax(layer_fc2)
y_pred_cls = tf.argmax(y_pred, axis=1)
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=y_true,
                                                        logits=layer_fc2)
cost = tf.reduce_mean(cross_entropy)
batch = tf.Variable(0, trainable=False)

train_size = data.train.labels.shape[0]
learning_rate = tf.train.exponential_decay(
    1e-5,                # Base learning rate.
    batch * batch_size,  # Current index into the dataset.
    2000,                # Decay step.
    0.95,                # Decay rate.
    staircase=True)
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(
    cost, global_step=batch)
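Since learning_rate is itself a tensor driven by batch, the decay can be checked directly by evaluating it at different step values. A minimal sanity-check sketch, assuming a fresh session over the graph above:

# Sanity check: with staircase=True the rate drops by 5% once
# batch * batch_size crosses the decay step of 2000.
with tf.Session() as session:
    session.run(tf.global_variables_initializer())
    print(session.run(learning_rate))   # 1e-05 at step 0
    session.run(tf.assign(batch, 20))   # 20 * 100 = 2000 examples seen
    print(session.run(learning_rate))   # 9.5e-06 after one decay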

Both of these work on their own. However, when I try to use them together, i.e., transform the gradients and use the decaying learning rate at the same time, the adaptive learning rate does not seem to take effect and the optimizer just uses a constant learning rate. This is how I combine them:

def do_something(gradssss):
    # Scale each gradient tensor by 2. Note that multiplying the Python
    # list itself (`gradssss * 2`) would duplicate the list rather than
    # scale the tensors.
    return [g * 2 for g in gradssss]


batch_size = 100

y_pred = tf.nn.softmax(layer_fc2)
y_pred_cls = tf.argmax(y_pred, axis=1)
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=y_true,
                                                        logits=layer_fc2)
cost = tf.reduce_mean(cross_entropy)
batch = tf.Variable(0, trainable=False)

train_size = data.train.labels.shape[0]
learning_rate = tf.train.exponential_decay(
    1e-5,                # Base learning rate.
    batch * batch_size,  # Current index into the dataset.
    2000,                # Decay step.
    0.95,                # Decay rate.
    staircase=True)

optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)

params2 = [weights_conv2, weights_conv1]

gradients = tf.gradients(cost, params2)
new_gradients = do_something(gradients)
train_op = optimizer.apply_gradients(zip(new_gradients, params2),
                                     global_step=batch)

Am I missing something here? Any advice is much appreciated.
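For completeness, this is how I check whether the decay takes effect in the combined version: I evaluate learning_rate alongside the training op, expecting it to shrink by 5% every 20 iterations (since 20 * batch_size = 2000). The names x, y_true, and data.train.next_batch here stand in for my actual input pipeline:

# Monitor the decayed learning rate while training; the feed names are
# placeholders for the real pipeline.
with tf.Session() as session:
    session.run(tf.global_variables_initializer())
    for i in range(100):
        x_batch, y_batch = data.train.next_batch(batch_size)
        _, lr = session.run([train_op, learning_rate],
                            feed_dict={x: x_batch, y_true: y_batch})
        if i % 20 == 0:
            print('iteration %d: learning rate %.3e' % (i, lr))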

0 Answers:

There are no answers yet.