Linear regression with TensorFlow gives different weights and cost every time

Time: 2018-07-28 08:50:30

Tags: python tensorflow linear-regression

I am trying to implement linear regression using TensorFlow. Below is the code I am using.

import tensorflow as tf
import numpy as np 
import pandas as pd
import os
rng = np.random
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
# reading data from a csv file
file1 = pd.read_csv('out.csv')
x_data=file1['^GSPC']
# converting dataframe into an array
x_data=x_data.values
y_data=file1['FB']
#converting dataframe into array
y_data=y_data.values
n_steps = 1000    #Total number of steps
n_iterations = []  #Nth iteration value
n_loss = []      #Loss at nth iteration
learned_weight = []  #weight at nth iteration
learned_bias = []  #bias value at nth iteration
# Try to find values for W and b that compute y_data = W * x_data + b
W = tf.Variable(rng.randn())
b = tf.Variable(rng.rand())
y = W * x_data + b
# Minimize the mean squared errors.
loss=tf.reduce_sum(tf.pow(y-y_data, 2))/(2*28)
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)
with tf.Session() as sess:
  # Before starting, initialize the variables.  We will 'run' this first.
  sess.run(tf.global_variables_initializer())
  for step in range(n_steps):
    sess.run(train)
    n_iterations.append(step)
    n_loss.append(loss.eval())
    learned_weight.append(W.eval())
    learned_bias.append(b.eval())
print("Final Weight: "+str(learned_weight[-1])+", Final Bias: "+str(learned_bias[-1]) + ", Final cost:"+str(n_loss[-1]))

The problem is that every time I run the code I get different results (weights, bias, and cost/loss). From the resources I have studied, the weights, bias, and cost should be roughly the same on every run. Second, the line (y = weights * x_data + bias) does not fit the training data well. Third, I have to convert the dataframes x_data and y_data into arrays by doing the following:

x_data=x_data.values
y_data=y_data.values

If I do not do this as shown above, running my code raises the following error:

Traceback (most recent call last):
  File "python", line 33, in <module>
  File "tensorflow/python/framework/fast_tensor_util.pyx", line 120, in tensorflow.python.framework.fast_tensor_util.AppendObjectArrayToTensorProto
TypeError: Expected binary or unicode string, got <tf.Tensor 'sub:0' shape=(28,) dtype=float32>

Please help me understand what I am doing wrong!

P.S.: My question may sound silly because I am new to TensorFlow and machine learning.

1 Answer:

Answer 0: (score: 2)

There are errors in how the code is executed:

  • tf.placeholder is used to hold the data that is passed into the model.
  • When the graph is executed, that data is fed to the placeholders through the feed_dict argument of sess.run (see the minimal sketch below).
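
As a minimal sketch of that placeholder/feed_dict pattern (TF 1.x API; the x, y, and doubled names here are purely illustrative and not part of the answer's code):

import numpy as np
import tensorflow as tf

# placeholders declare graph inputs; no data is attached until run time
x = tf.placeholder(tf.float32, shape=[None])
y = tf.placeholder(tf.float32, shape=[None])
doubled = 2.0 * x + y

with tf.Session() as sess:
    # feed_dict maps each placeholder to a NumPy array at execution time
    result = sess.run(doubled, feed_dict={x: np.array([1.0, 2.0]),
                                          y: np.array([3.0, 4.0])})
    print(result)  # [5. 8.]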

Here is an updated example:

Build the graph

import numpy as np
import tensorflow as tf

# dataset
X_data = np.random.randn(100,3)
y_data = 2*np.sum(X_data, 1)+0.01

# reshape y to be a column vector
y_data = np.reshape(y_data, [-1, 1])

# parameters
n_steps = 1000    #Total number of steps
batch_size = 20
input_length = X_data.shape[0] # => 100
display_cost = 500

# data placeholders
X = tf.placeholder(shape=[None, 3],dtype = tf.float32)
y = tf.placeholder(shape=[None, 1],dtype = tf.float32)

# build the model 
W = tf.Variable(initial_value = tf.random_normal([3,1]))
b = tf.Variable(np.random.rand())
y_fitted = tf.add(tf.matmul(X, W), b)

# Minimize the mean squared errors
loss=tf.losses.mean_squared_error(labels=y, predictions=y_fitted)
optimizer = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
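
A note on the loss used above: tf.losses.mean_squared_error averages the squared differences, whereas the question's reduce_sum(...)/(2*28) is half the mean for 28 samples, so the two differ only by a constant factor. A small hedged check (the labels/preds values below are made up for illustration):

import tensorflow as tf

labels = tf.constant([[1.0], [2.0], [3.0]])
preds = tf.constant([[1.5], [2.0], [2.0]])

# built-in MSE vs. an explicit mean of squared differences
builtin_mse = tf.losses.mean_squared_error(labels=labels, predictions=preds)
manual_mse = tf.reduce_mean(tf.square(preds - labels))

with tf.Session() as sess:
    print(sess.run([builtin_mse, manual_mse]))  # both ~0.4167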

Execute in Session

# execute in Session
with tf.Session() as sess:
    # initialize all variables
    tf.global_variables_initializer().run()

    # Train the model
    for steps in range(n_steps):
        mini_batch = zip(range(0, input_length, batch_size),
                    range(batch_size, input_length+1, batch_size))

        # train data in mini-batches
        for (start, end) in mini_batch:
            sess.run(optimizer, feed_dict = {X: X_data[start:end],
                                             y: y_data[start:end]})

        # print training performance 
        if (steps+1) % display_cost == 0:
            print('Step: {}'.format((steps+1)))
            # evaluate loss function
            cost = sess.run(loss, feed_dict = {X: X_data,
                                               y: y_data})
            print('Cost: {}'.format(cost))

    # report the learned weight and bias
    print('\nFinal Weight: {}'.format(W.eval()))
    print('\nFinal Bias: {}'.format(b.eval()))
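
For reference, the zip of the two ranges above just produces (start, end) index pairs for consecutive mini-batches; a quick illustration with the same batch_size and input_length as in the example:

batch_size, input_length = 20, 100
pairs = list(zip(range(0, input_length, batch_size),
                 range(batch_size, input_length + 1, batch_size)))
print(pairs)  # [(0, 20), (20, 40), (40, 60), (60, 80), (80, 100)]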

Output from 2 runs

# Run 1
Step: 500
Cost: 3.1569701713918263e-11
Step: 1000
Cost: 3.1569701713918263e-11

Final Weight: [[2.0000048]
 [2.0000024]
 [1.9999973]]

Final Bias: 0.010000854730606079

# Run 2
Step: 500
Cost: 7.017615221566187e-12
Step: 1000
Cost: 7.017615221566187e-12

Final Weight: [[1.9999975]
 [1.9999989]
 [1.9999999]]

Final Bias: 0.0099998963996768

Indeed, the weights and bias come out approximately the same across multiple runs that fit the same model on the same dataset. Likewise, NumPy ndarrays are the preferred data format for numerical computation, hence the conversion of the dataframes to ndarrays.
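
If bit-for-bit identical numbers across runs are needed (the question's first point), one option, not shown in the answer above, is to fix the random seeds before building the graph; a minimal sketch assuming the TF 1.x API:

import numpy as np
import tensorflow as tf

np.random.seed(0)       # makes NumPy-generated data and initial values repeatable
tf.set_random_seed(0)   # seeds TensorFlow's graph-level random ops

W = tf.Variable(tf.random_normal([3, 1]))
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(W.eval())     # identical values on every run of the script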