TensorFlow ValueError when changing the data content: Cannot feed value of shape (1, 1) for Tensor 'Placeholder_1:0'

Time: 2018-08-10 12:59:11

Tags: tensorflow lstm

This post is related to the following question. The code used here is taken from its accepted answer.

The program itself works fine, but if I change only the values of the provided data from

df = pd.DataFrame({'Temperature': [183, 10.7, 24.3, 10.7],
                   'Weight': [8, 11.2, 14, 11.2],
                   'Size': [3.97, 7.88, 11, 7.88],
                   'Property': [0,1,2,0]})

to

df = pd.DataFrame({'Temperature': [0,0,0,0],
                   'Weight': [1,2,3,4],
                   'Size': [1,2,3,4],
                   'Property': [1,1,1,1]})

I get the following error when executing the code:

  

ValueError: Cannot feed value of shape (1, 1) for Tensor 'Placeholder_1:0', which has shape '(?, 3)'

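Nothing has changed structurally, so this error confuses me. Oddly, changing the data values may or may not trigger the problem. I tried various TF versions, including the latest one, and the same problem always occurs.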

Does anyone know what I am missing? The full code example is below.

import tensorflow as tf
import pandas as pd
import numpy as np
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

df = pd.DataFrame({'Temperature': [183, 10.7, 24.3, 10.7],
                   'Weight': [8, 11.2, 14, 11.2],
                   'Size': [3.97, 7.88, 11, 7.88],
                   'Property': [0,1,2,0]})


df.Property = df.Property.shift(-1)
print ( df.head() )
# parameters
time_steps = 1
inputs = 3
outputs = 3

df = df.iloc[:-1,:]

df = df.values

train_X = df[:, 1:]
train_y = df[:, 0]

scaler = MinMaxScaler(feature_range=(0, 1))    
train_X = scaler.fit_transform(train_X)

train_X = train_X[:,None,:]

onehot_encoder = OneHotEncoder()
encode_categorical = train_y.reshape(len(train_y), 1)
train_y = onehot_encoder.fit_transform(encode_categorical).toarray()

learning_rate = 0.001
epochs = 500
batch_size = int(train_X.shape[0]/2)
length = train_X.shape[0]
display = 100
neurons = 100

tf.reset_default_graph()

X = tf.placeholder(tf.float32, [None, time_steps, inputs])
y = tf.placeholder(tf.float32, [None, outputs])

cell = tf.contrib.rnn.BasicLSTMCell(num_units=neurons, activation=tf.nn.relu)
cell_outputs, states = tf.nn.dynamic_rnn(cell, X, dtype=tf.float32)

stacked_outputs = tf.reshape(cell_outputs, [-1, neurons])
out = tf.layers.dense(inputs=stacked_outputs, units=outputs)

loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(
            labels=y, logits=out))

optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(loss)

accuracy = tf.metrics.accuracy(labels=tf.argmax(y, 1),
                               predictions=tf.argmax(out, 1),
                               name="accuracy")
precision = tf.metrics.precision(labels=tf.argmax(y, 1),
                                 predictions=tf.argmax(out, 1),
                                 name="precision")
recall = tf.metrics.recall(labels=tf.argmax(y, 1),
                           predictions=tf.argmax(out, 1),
                           name="recall")
# F1 is the harmonic mean of precision and recall
f1 = 2 * precision[1] * recall[1] / ( precision[1] + recall[1] )

with tf.Session() as sess:
    tf.global_variables_initializer().run()
    tf.local_variables_initializer().run()

    for steps in range(epochs):
        mini_batch = zip(range(0, length, batch_size),
                   range(batch_size, length+1, batch_size))

        for (start, end) in mini_batch:
            sess.run(training_op, feed_dict = {X: train_X[start:end,:,:],
                                               y: train_y[start:end,:]})

        if (steps+1) % display == 0:
            loss_fn = loss.eval(feed_dict = {X: train_X, y: train_y})
            print('Step: {}  \tTraining loss: {}'.format((steps+1), loss_fn))

    acc, prec, recall, f1 = sess.run([accuracy, precision, recall, f1],
                                     feed_dict = {X: train_X, y: train_y})

    print('\nEvaluation  on training set')
    print('Accuracy:', acc[1])
    print('Precision:', prec[1])
    print('Recall:', recall[1])
    print('F1 score:', f1)

2 Answers:

Answer 0 (score: 1)

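What you have here is a classification network: it takes inputs or features (Temperature, Weight, Size) and classifies them into one of your classes: 0, 1, or 2 (the Property field).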

When you modified the original dataset, you changed the number of classes from 3 (0, 1, 2) to 1 (only the class 1).

To make the code work, you only need to modify the parameters section of the code so that it fits your dataset.

# parameters
time_steps = 1
inputs = 3
outputs = 1

Note: in this context I find the term outputs a bit ambiguous. I would use something like nb_classes.
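The shape mismatch can be reproduced without TensorFlow. The sketch below is not the code from the question: one_hot is a hypothetical NumPy stand-in for the OneHotEncoder step, and the two label arrays are the Property values as they look after the shift(-1) and the last-row drop. It suggests that the one-hot width follows the number of distinct labels, so a label column with a single class yields batches of shape (1, 1), which no longer fit a placeholder built for 3 classes.

```python
import numpy as np

def one_hot(labels):
    """Hypothetical NumPy stand-in for the OneHotEncoder step in the question."""
    # One column per distinct label value; indices map each label to its column.
    classes, indices = np.unique(labels, return_inverse=True)
    return np.eye(len(classes), dtype=np.float32)[indices]

outputs = 3  # width the placeholder y was built with in the question

original = one_hot(np.array([1, 2, 0]))  # three distinct classes
modified = one_hot(np.array([1, 1, 1]))  # a single class

print(original.shape)  # (3, 3)
print(modified.shape)  # (3, 1)

# With batch_size = int(3 / 2) = 1, each mini-batch of labels is one row.
batch = modified[0:1, :]
print(batch.shape)     # (1, 1)
```

Comparing modified.shape[1] against outputs before building the graph would surface the problem earlier than the feed_dict error does.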

Answer 1 (score: 1)

As @Lescurel rightly pointed out, in a classification setting the variable outputs should reflect the number of classes in the target variable.

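In a regression setting, it would instead reflect the number of columns of the target variable (assuming we predict more than one variable).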

Given the sample input data:

df = pd.DataFrame({'Temperature': [1,2,3,4,5],
                   'Weight': [2,4,6,8,10],
                   'Size': [9,24,9,9,9],
                   'Property': [0,0,0,0,1]})

The number of target classes is 2. Hence outputs = 2.
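Rather than hard-coding the value, the class count can be read off the labels. This is a minimal sketch in plain Python, assuming the same shift(-1) and last-row drop as in the question's code; labels_after_shift is a hypothetical helper, not part of pandas.

```python
def labels_after_shift(values):
    """Mimic df.Property.shift(-1) followed by df.iloc[:-1]:
    each row takes the next row's label, and the last row,
    which has no successor, is dropped."""
    return values[1:]

property_column = [0, 0, 0, 0, 1]
labels = labels_after_shift(property_column)

# Number of distinct label values, i.e. the width of the one-hot encoding.
nb_classes = len(set(labels))

print(labels)      # [0, 0, 0, 1]
print(nb_classes)  # 2
```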

Note: the code you posted at https://paste.ubuntu.com/p/tmXgQfm8GB/ works fine for me.

I just noticed that the target variable Property is the last column of the DataFrame.

   Temperature  Weight  Size  Property
0            1       2     9       0.0
1            2       4    24       0.0
2            3       6     9       0.0
3            4       8     9       1.0
4            5      10     9       NaN

Modify your code as follows. Instead of:

# X_y_split
train_X = df[:, 1:]
train_y = df[:, 0]

change it to:

# X_y_split
train_X = df[:, :-1]
train_y = df[:, -1]
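Positional slicing depends on the column order, which, as far as I know, differs between pandas versions: older releases sorted dict keys alphabetically, which would put Property first. As a sketch of an alternative, the split can be done by column name before converting to an array. The feature_columns list is a name introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Temperature': [1, 2, 3, 4, 5],
                   'Weight': [2, 4, 6, 8, 10],
                   'Size': [9, 24, 9, 9, 9],
                   'Property': [0, 0, 0, 0, 1]})

# Same preprocessing as in the question: predict the next row's Property.
df['Property'] = df['Property'].shift(-1)
df = df.iloc[:-1, :]

# Select by name so the result does not depend on where Property sits.
feature_columns = ['Temperature', 'Weight', 'Size']
train_X = df[feature_columns].values
train_y = df['Property'].values

print(train_X.shape)  # (4, 3)
print(train_y)        # [0. 0. 0. 1.]
```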