Why does the training cost in my TensorFlow example code keep increasing?

Posted: 2017-02-01 08:29:30

Tags: python python-3.x tensorflow

Hi, I'm learning TensorFlow. Below is my code, a simple multi-variable TensorFlow example. The environment is Python 3.5.3, TensorFlow 0.12.1, Windows 7.

import tensorflow as tf

# Input data & output data
x1_data = [1.0, 0.0, 3.0, 0.0, 5.0]
x2_data = [0.0, 2.0, 0.0, 4.0, 5.0]
y_data =  [1.0, 2.0, 3.0, 4.0, 5.0]

# W1, W2, b random generation
# W1 = 1, W2 = 1, b = 0 is ideal
W1 = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
W2 = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
b = tf.Variable(tf.random_uniform([1], -1.0, 1.0))

# Our hypothesis
hypothesis = W1 * x1_data + W2 * x2_data + b
# Simplified cost function
cost = tf.reduce_mean(tf.square(hypothesis - y_data))

# Minimize
a = tf.Variable(0.1) # Learning Rate
optimizer = tf.train.GradientDescentOptimizer(a)
train = optimizer.minimize(cost)

# Initialise
init = tf.global_variables_initializer()

# Launch
sess = tf.Session()
sess.run(init)

# Train loop
for step in range(10):
    sess.run(train)
    print(step, sess.run(cost), sess.run(W1), sess.run(W2), sess.run(b))

I expected the cost to decrease over the training loop, but instead it increases without bound.

The same code with a single variable runs fine and the cost decreases. I don't understand why it diverges with two variables. Here is the output:

0 52.0504 [ 1.47101164] [ 2.24049234] [ 0.86718893]
1 157.129 [-1.74108529] [-1.84496927] [-0.22162986]
2 478.055 [ 4.02118969] [ 5.11457825] [ 1.86127353]
3 1457.33 [-5.99311352] [-7.13181305] [-1.60902405]
4 4445.18 [ 11.50830746] [ 14.20653534] [ 4.60829926]
5 13561.2 [-19.06884766] [-23.10119247] [-6.10722733]
6 41374.3 [ 34.32733154] [ 42.03698349] [ 12.74352837]
7 126232.0 [-58.95558929] [-71.76408386] [-20.05929375]
8 385134.0 [ 103.96767426] [ 126.9929657] [ 37.3527832]
9 1.17505e+06 [-180.62704468] [-220.19728088] [-62.82305145]

1 Answer:

Answer 0 (score: 0)

The first fix I found is to lower the learning rate to 0.01. It looks like the update steps are drastically overshooting your parameters. This might also not happen if you used some regularization technique (such as L2).
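To see why 0.1 overshoots: a standard result is that plain gradient descent on a quadratic cost diverges once the learning rate exceeds 2/λmax, where λmax is the largest eigenvalue of the Hessian of the cost. As a rough sanity check (a numpy sketch, not something run against your exact setup), you can compute that threshold for this dataset:

import numpy as np

# Design matrix with a ones column for the bias: predictions are X @ [W1, W2, b]
X = np.array([[1.0, 0.0, 1.0],
              [0.0, 2.0, 1.0],
              [3.0, 0.0, 1.0],
              [0.0, 4.0, 1.0],
              [5.0, 5.0, 1.0]])

# Hessian of the mean-squared-error cost: (2/n) * X^T X
H = (2.0 / len(X)) * X.T @ X

lam_max = np.linalg.eigvalsh(H).max()
print("largest Hessian eigenvalue:", lam_max)        # roughly 27.5
print("divergence threshold 2/lam_max:", 2.0 / lam_max)  # roughly 0.073

The threshold comes out around 0.073, which is why 0.1 blows up while 0.01 converges.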

Second, your code could use some improvements: use TensorFlow matrix operations and initialize the bias to zero. Oddly, when the operations are expressed with TF matrix functions, even a 0.1 learning rate works.

import tensorflow as tf
import numpy as np

# Input data & output data
x1_data = [1.0, 0.0, 3.0, 0.0, 5.0]
x2_data = [0.0, 2.0, 0.0, 4.0, 5.0]
y_data =  [1.0, 2.0, 3.0, 4.0, 5.0]

# NB: input_X is a trainable Variable here, so optimizer.minimize() will
# update the inputs along with W and b; pass trainable=False (or use
# tf.constant) if the input data should stay fixed.
input_X = tf.Variable(np.row_stack((x1_data, x2_data)).astype(np.float32))
W = tf.Variable(tf.random_uniform([1,2], -1.0, 1.0))
b = tf.Variable(tf.zeros([1,1]))

# Our hypothesis
hypothesis = tf.add(tf.matmul(W,input_X),b)
# Simplified cost function
cost = tf.reduce_mean(tf.square(hypothesis - y_data))

# Minimize
a = tf.Variable(0.1) # Learning Rate
optimizer = tf.train.GradientDescentOptimizer(a)
train = optimizer.minimize(cost)

# Initialise
init = tf.global_variables_initializer()

# Launch
sess = tf.Session()
sess.run(init)

# Train loop
for step in range(10):
    sess.run(train)
    print(step, sess.run(cost), sess.run(W), sess.run(b))
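For the L2 regularization mentioned above, a minimal sketch would be to add a penalty on W to the cost (beta is an arbitrary penalty strength chosen here for illustration, not a value from the answer):

beta = 0.01  # illustrative penalty strength; tune for your problem
# tf.nn.l2_loss(W) computes sum(W ** 2) / 2
cost = tf.reduce_mean(tf.square(hypothesis - y_data)) + beta * tf.nn.l2_loss(W)
train = optimizer.minimize(cost)

The penalty keeps the weights small, which also damps the size of the gradient steps.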