I'm working through the exercises in the style transfer tutorial. Does anyone know how to replace the basic gradient descent with the Adam optimizer? I think this is the part of the code that needs to change. Any help is much appreciated.
# Reduce the dimensionality of the gradient.
grad = np.squeeze(grad)
# Scale the step-size according to the gradient-values.
step_size_scaled = step_size / (np.std(grad) + 1e-8)
# Update the image by following the gradient.
mixed_image -= grad * step_size_scaled
Answer 0 (score: 0)
Referring to slides 36 and 37 of the Stanford CS231n slides,
first_moment = 0
second_moment = 0
must be declared above the for i in range(num_iterations): line that exists in that GitHub file. Also, initialize the beta1 and beta2 variables there according to your requirements.
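For example, a minimal sketch of that setup; the values 0.9 and 0.999 are the commonly used Adam defaults, not taken from the tutorial, so adjust them as needed:

beta1 = 0.9    # decay rate for the first moment (moving average of gradients); assumed default
beta2 = 0.999  # decay rate for the second moment (moving average of squared gradients); assumed default
first_moment = 0
second_moment = 0

for i in range(num_iterations):
    # ... the existing loop body from the tutorial goes here ...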
Then you can replace the code block with the following:
# Reduce the dimensionality of the gradient.
grad = np.squeeze(grad)
# Update the exponential moving averages of the gradient and its square.
first_moment = beta1 * first_moment + (1 - beta1) * grad
second_moment = beta2 * second_moment + (1 - beta2) * grad * grad
# Bias-correction steps; use (i + 1) because i starts at 0 in the loop,
# otherwise the denominator is zero on the first iteration.
first_unbias = first_moment / (1 - beta1 ** (i + 1))
second_unbias = second_moment / (1 - beta2 ** (i + 1))
# Update the image by following the gradient (the Adam step).
# Use np.sqrt, not tf.sqrt, since grad is a NumPy array here.
mixed_image -= step_size * first_unbias / (np.sqrt(second_unbias) + 1e-8)
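To sanity-check the update rule outside the style-transfer code, here is a self-contained NumPy sketch that minimizes a toy quadratic with the exact same Adam steps; the objective, step size, and iteration count are made up for illustration:

import numpy as np

# Toy objective: f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x = np.array([0.0])
step_size = 0.1
beta1, beta2 = 0.9, 0.999
first_moment = 0.0
second_moment = 0.0

for i in range(500):
    grad = 2.0 * (x - 3.0)
    # Exponential moving averages of the gradient and its square.
    first_moment = beta1 * first_moment + (1 - beta1) * grad
    second_moment = beta2 * second_moment + (1 - beta2) * grad * grad
    # Bias correction with (i + 1) because i starts at 0.
    first_unbias = first_moment / (1 - beta1 ** (i + 1))
    second_unbias = second_moment / (1 - beta2 ** (i + 1))
    # Adam parameter update.
    x -= step_size * first_unbias / (np.sqrt(second_unbias) + 1e-8)

print(x)  # converges toward 3.0

If x ends up near 3.0, the moment updates and bias corrections are wired correctly, and the same block can be dropped into the tutorial's loop.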