I'm working through the exercises in the style transfer tutorial. Does anyone know how to replace the basic gradient descent with the Adam optimizer? I think this is the part of the code that needs to change. Any help is much appreciated.
# Reduce the dimensionality of the gradient.
grad = np.squeeze(grad)
# Scale the step-size according to the gradient-values.
step_size_scaled = step_size / (np.std(grad) + 1e-8)
# Update the image by following the gradient.
mixed_image -= grad * step_size_scaled
Answer 0 (score: 0)
Referring to slides 36 and 37 of the Stanford CS231n slides,
first_moment = 0
second_moment = 0
must be declared above the for i in range(num_iterations): line that exists in that GitHub file. Also, initialize the beta1 and beta2 variables there according to your requirements.
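For example, a minimal sketch of that setup; the values 0.9 and 0.999 are the commonly used Adam defaults, not taken from the tutorial, so adjust them as needed:

beta1 = 0.9    # decay rate for the first moment (moving average of gradients); assumed default
beta2 = 0.999  # decay rate for the second moment (moving average of squared gradients); assumed default
first_moment = 0
second_moment = 0

for i in range(num_iterations):
    # ... the existing loop body from the tutorial goes here ...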
Then you can replace the code block with the following:
# Reduce the dimensionality of the gradient.
grad = np.squeeze(grad)
# Update the exponential moving averages of the gradient and its square.
first_moment = beta1 * first_moment + (1 - beta1) * grad
second_moment = beta2 * second_moment + (1 - beta2) * grad * grad
# Bias-correction steps; use (i + 1) because i starts at 0 in the loop,
# otherwise the denominator is zero on the first iteration.
first_unbias = first_moment / (1 - beta1 ** (i + 1))
second_unbias = second_moment / (1 - beta2 ** (i + 1))
# Update the image by following the gradient (the Adam step).
# Use np.sqrt, not tf.sqrt, since grad is a NumPy array here.
mixed_image -= step_size * first_unbias / (np.sqrt(second_unbias) + 1e-8)
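To sanity-check the update rule outside the style-transfer code, here is a self-contained NumPy sketch that minimizes a toy quadratic with the exact same Adam steps; the objective, step size, and iteration count are made up for illustration:

import numpy as np

# Toy objective: f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x = np.array([0.0])
step_size = 0.1
beta1, beta2 = 0.9, 0.999
first_moment = 0.0
second_moment = 0.0

for i in range(500):
    grad = 2.0 * (x - 3.0)
    # Exponential moving averages of the gradient and its square.
    first_moment = beta1 * first_moment + (1 - beta1) * grad
    second_moment = beta2 * second_moment + (1 - beta2) * grad * grad
    # Bias correction with (i + 1) because i starts at 0.
    first_unbias = first_moment / (1 - beta1 ** (i + 1))
    second_unbias = second_moment / (1 - beta2 ** (i + 1))
    # Adam parameter update.
    x -= step_size * first_unbias / (np.sqrt(second_unbias) + 1e-8)

print(x)  # converges toward 3.0

If x ends up near 3.0, the moment updates and bias corrections are wired correctly, and the same block can be dropped into the tutorial's loop.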