How to print out predicted values in TensorFlow

Time: 2019-07-05 16:09:42

Tags: tensorflow

I am new to tensorflow and learning slowly. After successfully compiling the model and getting the accuracy, I want to print out the predicted values, but I don't know how to do it.

My dataset has multivariate features with a single output. The output only contains 1, 0 and -1, so I made a one-hot encoder for it. I finished compiling the model and searched online for how to compute predictions in tensorflow, but I did not find a good solution for my problem.
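A minimal sketch of that encoding step (using a hypothetical label array; the shift maps {1, 0, -1} onto {0, 1, 2} before encoding, the same trick used in the full code below):

import numpy as np
from sklearn.preprocessing import OneHotEncoder

labels = np.array([1, 0, -1, 0, 1])        # hypothetical raw outputs
shifted = (labels - labels.max()) * (-1)   # map {1, 0, -1} -> {0, 1, 2}
onehot = OneHotEncoder().fit_transform(shifted.reshape(-1, 1)).toarray()
print(onehot)                              # rows are [1,0,0], [0,1,0] or [0,0,1]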

The precisionCalculate function computes the precision of each column of the test data, because after one-hot encoding train_y and test_y become [1,0,0], [0,1,0] and [0,0,1].
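As a small worked example of what that function computes (hypothetical arrays): the element-wise sum pred_y + test_y is 0 only where both are class 0, and 4 only where both are class 2, so those counts are the per-class true positives:

import numpy as np

pred_y = np.array([0, 2, 1, 0, 2])   # hypothetical predicted class indices
test_y = np.array([0, 2, 2, 1, 2])   # hypothetical true class indices
count = pred_y + test_y
print(len(count[count == 0]) / len(pred_y[pred_y == 0]))   # class-0 precision -> 0.5
print(len(count[count == 4]) / len(pred_y[pred_y == 2]))   # class-2 precision -> 1.0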

I tried

y_pred = sess.run(tf.argmax(y, 1), feed_dict={X: test_x, y: test_y})

but it turns out that y_pred is exactly the same as my test_y.

Here is my full code example.

import tensorflow as tf
import pandas as pd
import numpy as np
import tensorflow.contrib.rnn
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder, LabelEncoder
import pdb
np.set_printoptions(threshold=np.inf)


def precisionCalculate(pred_y, test_y):

    count = pred_y + test_y
    firstZero = len(count[count==0])

    countFour = len(count[count == 4])

    precision1 = firstZero / len(pred_y[pred_y==0] )

    precision3 = countFour / len(pred_y[pred_y==2])
    pdb.set_trace()
    return precision1, precision3


df = pd.read_csv('new_df.csv', skiprows=[0], header=None)
df.drop(columns=[0,1], inplace=True)
df.columns = np.arange(0, df.shape[1])  # integer column labels (no list wrapper, which would create a MultiIndex)
df[0] = df[0].shift(-1)

#parameters
time_steps = 1
inputs = df.shape[1]
outputs = 3

#remove nan as a result of shift values

df = df.iloc[:-1, :]

#convert to numpy
df = df.values

train_number = 30276 #start date from 1018
train_x = df[: train_number, 1:]
test_x = df[train_number:, 1:]
train_y = df[:train_number, 0]
test_y = df[train_number:, 0]
#data pre-processing

#x y split
#scale
scaler = MinMaxScaler(feature_range=(0,1))
train_x = scaler.fit_transform(train_x)
test_x = scaler.transform(test_x)  # reuse the scaler fitted on train_x instead of refitting on test data

#reshape into 3d array
train_x = train_x[:, None, :]
test_x = test_x[:, None, :]

#one-hot encode the outputs
onehot_encoder = OneHotEncoder()
#encoder = LabelEncoder()
max_ = train_y.max()
max2 = test_y.max()
train_y = (train_y - max_) * (-1)
test_y = (test_y - max2) * (-1)
encode_categorical = train_y.reshape(len(train_y), 1)
encode_categorical2 = test_y.reshape(len(test_y), 1)
train_y = onehot_encoder.fit_transform(encode_categorical).toarray()
test_y = onehot_encoder.fit_transform(encode_categorical2).toarray()

print(train_x.shape, train_y.shape, test_x.shape, test_y.shape)


#model parameters

learning_rate = 0.001
epochs = 100
batch_size = int(train_x.shape[0]/10)
length = train_x.shape[0]
display = 100
neurons = 100

tf.reset_default_graph()
X = tf.placeholder(tf.float32, [None, time_steps, 90],name='x')
y = tf.placeholder(tf.float32, [None, outputs],name='y')

#LSTM cell
cell = tf.contrib.rnn.BasicLSTMCell(num_units = neurons, activation = tf.nn.relu)
cell_outputs, states = tf.nn.dynamic_rnn(cell, X, dtype=tf.float32)
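# cell_outputs has shape [batch, time_steps, neurons]; flattening the time
# dimension below lets a single dense layer produce logits for every step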

# pass into Dense layer
stacked_outputs = tf.reshape(cell_outputs, [-1, neurons])
out = tf.layers.dense(inputs=stacked_outputs, units=outputs)
# softmax cross-entropy loss for the 3-class classification output
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=out, labels=y))

# optimizer to minimize cost
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(loss)

accuracy = tf.metrics.accuracy(labels =  tf.argmax(y, 1), predictions = tf.argmax(out, 1), name = "accuracy")
precision = tf.metrics.precision(labels=tf.argmax(y, 1), predictions=tf.argmax(out, 1), name="precision")
recall = tf.metrics.recall(labels=tf.argmax(y, 1), predictions=tf.argmax(out, 1),name="recall")
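# each tf.metrics.* call returns a (value, update_op) pair; index [1] below
# runs the update op, which also returns the updated metric value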
f1 = 2 * precision[1] * recall[1] / ( precision[1] + recall[1] )


with tf.Session() as sess:
    # initialize all variables
    tf.global_variables_initializer().run()
    tf.local_variables_initializer().run()

    # Train the model
    for steps in range(epochs):
        mini_batch = zip(range(0, length, batch_size), range(batch_size, length+1, batch_size))
        epoch_loss = 0
        i = 0

        # train data in mini-batches
        for (start, end) in mini_batch:

            sess.run(training_op, feed_dict = {X: train_x[start:end,:,:], y: train_y[start:end,:]})


        # print training performance
        if (steps+1) % display == 0:
            # evaluate loss function on training set
            loss_fn = loss.eval(feed_dict = {X: train_x, y: train_y})
            print('Step: {}  \tTraining loss: {}'.format((steps+1), loss_fn))

    # evaluate model accuracy
    acc, prec, recall, f1 = sess.run([accuracy, precision, recall, f1],feed_dict = {X: test_x, y: test_y})
    y_pred = sess.run(tf.argmax(y, 1), feed_dict={X: train_x, y: train_y})
    test_y_alter = np.argmax(test_y, axis=1)
    #print(test_y_alter)
    print(precisionCalculate(y_pred, test_y_alter))


    print(y_pred)

    #prediction = y_pred.eval(feed_dict={X: train_x, y: test_y})
    #print(prediction)
    print('\nEvaluation  on test set')
    print('Accuracy:', acc[1])
    print('Precision:', prec[1])
    print('Recall:', recall[1])
    print('F1 score:', f1)

1 Answer:

Answer 0 (score: 0):

I think you should use the output of the model rather than the label (y) in tf.argmax.

Here is my code for printing the model's prediction:

pred_y = tf.Print(tf.argmax(score, 1), [tf.argmax(score, 1)], message="prediction: ")
pred_y.eval()

In the code above, score represents the probability output of the model.
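Applied to the code in the question, that means taking the argmax of the logits tensor out rather than of the placeholder y. A minimal sketch, run inside the same session (out does not depend on y, so only X needs to be fed):

y_pred = sess.run(tf.argmax(out, 1), feed_dict={X: test_x})
print(y_pred)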