TensorFlow gradient descent optimizer gives nan results

Time: 2019-04-04 22:07:59

Tags: python-3.x tensorflow

As an introduction to TensorFlow, I'm trying to find points on the unit circle that are as far apart from each other as possible. I keep getting nan results and I can't work out why.

I've already found and fixed a couple of problems with my algorithm. I've also tried changing the learning rate and the orders of magnitude of alpha and beta, but the same problem remains.

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

x = tf.Variable(tf.random_uniform([4, 1], -1.0, 1.0, dtype=tf.float32, seed=42), dtype=tf.float32, name="x")
y = tf.Variable(tf.random_uniform([1, 4], -1.0, 1.0, dtype=tf.float32, seed=43), dtype=tf.float32, name="y")

alpha = tf.constant(0.5, dtype=tf.float32, name='alpha')
beta = tf.constant(-0.25, dtype=tf.float32, name='beta')

# Error function for "on the circle"
errA = alpha * tf.math.reduce_mean(tf.abs((x*x) + (y*y) -1))

# Error function for "far apart"
# I'm sure there's a better way to do this but I couldn't figure out what it is

xstack = tf.stack([x,x,x,x], axis=0)
xt = tf.transpose(x)
yt = tf.transpose(y)
ystack = tf.stack([yt,yt,yt,yt], axis=0)
xtstack = tf.stack([xt,xt,xt,xt], axis=1)
ytstack = tf.stack([y,y,y,y], axis=1)

xs = xstack - xtstack
ys = ystack - ytstack

errB = beta * tf.math.reduce_mean(tf.math.sqrt(xs*xs + ys*ys))

n_epochs = 1000
learning_rate = 0.01

error = errA + errB

mse = tf.reduce_mean(tf.square(error), name="mse")

optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(mse)

init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)

    for epoch in range(n_epochs):
        if epoch % 100 == 0:
            print("Epoch", epoch, "MSE =", mse.eval())
        sess.run(training_op)

    best_x = x.eval()
    best_y = y.eval()
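
For what it's worth, here is a broadcast-based sketch of the "far apart" term that I believe builds the same pairwise differences as the stacking above (it assumes x has shape [4, 1] and y has shape [1, 4], as defined in the script; it is only meant to show the intended computation, not to fix the nan):

# Sketch: pairwise differences via broadcasting instead of tf.stack.
# Entry (i, j) of each matrix is the difference between point i and point j.
xs_b = x - tf.transpose(x)   # [4, 1] - [1, 4] -> [4, 4]
ys_b = tf.transpose(y) - y   # [4, 1] - [1, 4] -> [4, 4]
errB_b = beta * tf.math.reduce_mean(tf.math.sqrt(xs_b*xs_b + ys_b*ys_b))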

I expected to see the MSE go down and then to get the final best x and y. The actual output is

Epoch 0 MSE = 0.0034612082
Epoch 100 MSE = nan
Epoch 200 MSE = nan
Epoch 300 MSE = nan
Epoch 400 MSE = nan
Epoch 500 MSE = nan
Epoch 600 MSE = nan
Epoch 700 MSE = nan
Epoch 800 MSE = nan
Epoch 900 MSE = nan

and the best x and y both come out as nan too. Everything turns to nan as soon as the optimizer has run once.
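
For reference, here is a minimal sketch of how the gradients could be inspected to see where the nan first shows up (tf.gradients is the standard TF 1.x call; x, y, mse, errA, errB, init and training_op are the ones defined above):

grads = tf.gradients(mse, [x, y])   # d(mse)/dx and d(mse)/dy

with tf.Session() as sess:
    sess.run(init)
    print("mse before any step:", mse.eval())
    print("gradients before any step:", sess.run(grads))
    sess.run(training_op)           # a single optimizer step
    print("errA:", errA.eval(), "errB:", errB.eval())
    print("gradients after one step:", sess.run(grads))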

0 Answers:

No answers yet.