I have looked at all the similar answers to my question, but I still don't get it. My model/network is learning and my MSE keeps getting smaller, yet when I output predicted values for my input data, the predictions actually look worse the lower my MSE gets.
In this case I am outputting continuous data, so I removed the softmax and use the last layer of the network to produce floating-point output. Like others, I started out in SciKit with model.predict(X). I seem to be missing some concept of how TensorFlow works. I suspect that the variable I call "predicted" below is not what I think it is... that is, the statement predicted = sess.run(neural_network, feed_dict={x: X}).
Here is my code; any help is appreciated. I can't share the data publicly, but if the problem runs deeper than I think, I can mock some up. Thanks!
# Import stuff
import tensorflow as tf
import random
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Set up some global processing variables
input_DIM = 3
output_DIM = 1
Epochs = 500
Learn_Rate = .17
model_path = "C:\AML\Models\MLPerceptron"
# Define hidden layers and neurons
n_hidden_1 = 50
n_hidden_2 = 50
n_hidden_3 = 50
n_hidden_4 = 2
# ------------------ #
# Load the test file #
# ------------------ #
def read_dataset():
    # raw string so "\t" in the path is not treated as a tab escape
    df = pd.read_csv(r"C:\AML\data\test.csv")
    # print(len(df.columns))
    X = df[df.columns[0:3]].values
    Y = df[df.columns[3:4]].values
    return (X, Y)
X, Y = read_dataset()
#--------------------#
# TensorFlow Section #
#--------------------#
# Create placeholders, variables, tensors
x = tf.placeholder(tf.float32, [None, input_DIM])
W = tf.Variable(tf.zeros([input_DIM, output_DIM]))
b = tf.Variable(tf.zeros([output_DIM]))
# Diagnostic variables
cost_history = np.empty(shape=[1], dtype=float)
mse_history = []
accuracy_history = []
predicted = []
# Define the network structure
# NOTE: SoftMax activation layer removed to allow floating point output
def multilayer_perceptron(x, weights, biases):
    # Hidden layer with tanh activation
    layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
    layer_1 = tf.nn.tanh(layer_1)
    # Hidden layer with tanh activation
    layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
    layer_2 = tf.nn.tanh(layer_2)
    # Hidden layer with tanh activation
    layer_3 = tf.add(tf.matmul(layer_2, weights['h3']), biases['b3'])
    layer_3 = tf.nn.tanh(layer_3)
    # Final layer with sigmoid activation
    layer_4 = tf.add(tf.matmul(layer_3, weights['h4']), biases['b4'])
    layer_4 = tf.nn.sigmoid(layer_4)
    # Output layer with linear activation (softmax REMOVED)
    out_layer = tf.matmul(layer_4, weights['out']) + biases['out']
    return out_layer
    #return layer_4
# Define the weights and the biases for each layer
weights = {
    'h1': tf.Variable(tf.truncated_normal([input_DIM, n_hidden_1])),
    'h2': tf.Variable(tf.truncated_normal([n_hidden_1, n_hidden_2])),
    'h3': tf.Variable(tf.truncated_normal([n_hidden_2, n_hidden_3])),
    'h4': tf.Variable(tf.truncated_normal([n_hidden_3, n_hidden_4])),
    'out': tf.Variable(tf.truncated_normal([n_hidden_4, output_DIM]))
}
biases = {
    'b1': tf.Variable(tf.truncated_normal([n_hidden_1])),
    'b2': tf.Variable(tf.truncated_normal([n_hidden_2])),
    'b3': tf.Variable(tf.truncated_normal([n_hidden_3])),
    'b4': tf.Variable(tf.truncated_normal([n_hidden_4])),
    'out': tf.Variable(tf.truncated_normal([output_DIM]))
}
# Define the perceptron network
neural_network = multilayer_perceptron(x, weights, biases)
# Create a placeholder for the expected result during training
expected = tf.placeholder(tf.float32, [None, output_DIM])
# Define TensorFlow session vars
sess = tf.Session()
init = tf.global_variables_initializer()
saver = tf.train.Saver()
# Define the loss function as squared difference of net and expected
cost_function = tf.reduce_mean(tf.square(expected - neural_network))
# Define training step using gradient descent
train_step = tf.train.GradientDescentOptimizer(Learn_Rate).minimize(cost_function)
sess.run(init)
# ---------------------------------- #
# Begin machine learning in earnest! #
# ---------------------------------- #
for i in range(Epochs):
    # Execute training step
    sess.run(train_step, feed_dict={x: X, expected: Y})
    # Munge some diagnostics
    cost = sess.run(cost_function, feed_dict={x: X, expected: Y})
    cost_history = np.append(cost_history, cost)
    predicted = sess.run(neural_network, feed_dict={x: X})
    mse = tf.reduce_mean(tf.square(predicted - Y))
    mse_ = sess.run(mse)
    mse_history.append(mse_)
    print('epoch : ', i, ' - ', 'cost: ', cost, " - MSE: ", mse_)
print(predicted)
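As an aside (possibly unrelated to the main problem): I believe the tf.reduce_mean(tf.square(predicted - Y)) call inside the loop adds a fresh op to the graph on every epoch. A minimal sketch of what I think the tidier pattern looks like, defining the MSE op once up front (mse_op is my own name; since it is identical to cost_function here, that would also explain why cost and MSE print the same values in the log below):
# Sketch: build the MSE op once, before the loop, and just feed data each epoch.
mse_op = tf.reduce_mean(tf.square(expected - neural_network))
for i in range(Epochs):
    sess.run(train_step, feed_dict={x: X, expected: Y})
    # one run() call can evaluate several tensors at once
    cost, mse_ = sess.run([cost_function, mse_op], feed_dict={x: X, expected: Y})
    mse_history.append(mse_)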
UPDATE: I think my problem is understanding what is inside the tensor named "predicted". Below is a dump of the feature matrix X, the Y vector I am training against, and the predicted output from tf.
Coming from the numpy/SciKit world, I was expecting a column vector as output. That is clearly not what is being produced. I have a few questions:
How do I extract it? I've tried several things, but indexing the tensor with the usual numpy methods hasn't gotten me very far.
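For reference, this is roughly what I expected to work, based on my understanding (which may be wrong) that sess.run() returns a plain NumPy array; the shape comments are what I expect, not what I observe:
predicted = sess.run(neural_network, feed_dict={x: X})
print(type(predicted))      # expected: <class 'numpy.ndarray'>
print(predicted.shape)      # expected: (num_samples, output_DIM), i.e. (N, 1)
col = predicted[:, 0]       # ordinary NumPy slicing should then give the column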
Feature matrix
[[ 3.05194408e-04  9.31436261e-09  2.00724503e-06 -7.80543587e-07]
 [ 3.05194408e-04  9.31436261e-09  2.00724503e-06 -7.80543587e-07]
 [ 3.66233289e-04  1.86287252e-09  1.11772351e-08  4.19146318e-07]
 ...
 [ 1.13227125e-02 -1.12517500e-06 -1.23229029e-06 -1.43068610e-06]
 [-2.84441188e-02 -1.21366156e-06 -2.33883657e-06 -3.17060903e-06]
 [-6.54641986e-02 -1.12983219e-06 -2.34349363e-06 -4.67581049e-06]]
Target Y vector
[[-3.05194408e-05]
 [-3.05194408e-05]
 [ 3.05194408e-04]
 ...
 [ 9.45797488e-02]
 [ 4.90752608e-02]
 [ 1.76097173e-02]]
epoch : 0 - cost: 0.299485 - MSE: 0.299485
epoch : 1 - cost: 0.278928 - MSE: 0.278928
epoch : 2 - cost: 0.232972 - MSE: 0.232972
epoch : 3 - cost: 0.221542 - MSE: 0.221542
epoch : 4 - cost: 0.216305 - MSE: 0.216305
epoch : 5 - cost: 0.211555 - MSE: 0.211555
epoch : 6 - cost: 0.204823 - MSE: 0.204823
epoch : 7 - cost: 0.190776 - MSE: 0.190776
epoch : 8 - cost: 0.155517 - MSE: 0.155517
epoch : 9 - cost: 0.136089 - MSE: 0.136089
Predicted
[[ 4.07929212e-01  2.27119313e-06  1.24767032e-02 ...,  1.15948327e-01  1.12259677e-02  1.41322374e-01]
 [ 4.07929212e-01  2.27119313e-06  1.24767032e-02 ...,  1.15948327e-01  1.12259677e-02  1.41322374e-01]
 [ 4.07875597e-01  2.27081409e-06  1.24729313e-02 ...,  1.15964599e-01  1.12368921e-02  1.41370580e-01]
 ...
 [ 4.01492357e-01  2.19653884e-06  1.19107403e-02 ...,  1.17981538e-01  1.31033035e-02  1.47243097e-01]
 [ 4.21375126e-01  2.47630169e-06  1.40144946e-02 ...,  1.13659360e-01  7.23242434e-03  1.27145886e-01]
 [ 4.32397932e-01  2.74293211e-06  1.61206387e-02 ...,  1.16918713e-01  3.85677768e-03  1.13023274e-01]]
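My current guess (unverified) is that the tensor I am evaluating is not the [None, 1] out_layer at all; for example, stale ops from an earlier run could still be sitting in the default graph if the script is re-run in the same interpreter or notebook. These are the checks I would try, assuming the code above:
# Sketch: sanity-check what 'neural_network' actually refers to.
print(neural_network)                # tensor name and static shape
print(neural_network.get_shape())    # expected: (?, 1) when output_DIM = 1
# If the static shape is not (?, 1), start from a clean graph and rebuild:
# tf.reset_default_graph()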