TensorFlow ValueError: Cannot feed value of shape (32, 2) for Tensor 'InputData/X:0', which has shape '(?, 100)'

Time: 2018-10-15 17:35:43

Tags: python tensorflow sentiment-analysis training-data valueerror

I am new to TensorFlow and machine learning. I am trying to build a sentiment analysis neural network with TensorFlow.

I have set up the architecture and am trying to train the model, but I get this error:

  

ValueError: Cannot feed value of shape (32, 2) for Tensor 'InputData/X:0', which has shape '(?, 100)'

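I think the error is related to my input layer, net = tflearn.input_data([None, 100]). The tutorial I followed suggested this input shape, with a batch size of None and a length of 100, because that is the sequence length. So (None, 100) is, as far as I understand, the shape that the training data fed into the network needs to have, correct?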

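Can someone explain why the suggested batch size in the input shape is None, and why TensorFlow is trying to feed the network a value of shape (32, 2)? Where does the sequence length of 2 come from?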

If my understanding of any of this is wrong, please feel free to correct me. I am still trying to learn the theory.

Thanks in advance.

In [1]:

import tflearn
from tflearn.data_utils import to_categorical, pad_sequences
from tflearn.datasets import imdb

In [2]:

#Loading IMDB dataset
train, test, _ = imdb.load_data(path='imdb.pkl', n_words=10000,
                                valid_portion=0.1)
trainX, trainY = train
testX, testY = test

In [3]:

#Data sequence padding 
trainX = pad_sequences(trainX, maxlen=100, value=0.)  
testX = pad_sequences(testX, maxlen=100, value=0.)
#converting labels of each review to vectors
trainY = to_categorical(trainY, nb_classes=2)
trainX = to_categorical(testY, nb_classes=2)


In [4]:

#network building 
net = tflearn.input_data([None, 100])
net = tflearn.embedding(net, input_dim=10000, output_dim=128)
net = tflearn.lstm(net, 128, dropout = 0.8)
net = tflearn.fully_connected(net, 2, activation='softmax') 
net = tflearn.regression(net, optimizer = 'adam', learning_rate=0.0001,
                         loss='categorical_crossentropy')


WARNING:tensorflow:From C:\Users\Nason\Anaconda33\envs\TensorFlow1.8CPU\lib\site-packages\tflearn\objectives.py:66: calling reduce_sum (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead


In [5]:

#Training
model = tflearn.DNN(net, tensorboard_verbose=0)   #train using tensorflow Deep nueral net
model.fit(trainX, trainY, validation_set=(testX, testY), show_metric=True,    #fit launches training process for training and validation data, metric displays data as its training.
          batch_size=32)


---------------------------------
Run id: U7NONK
Log directory: /tmp/tflearn_logs/
INFO:tensorflow:Summary name Accuracy/ (raw) is illegal; using Accuracy/__raw_ instead.
---------------------------------
Training samples: 2500
Validation samples: 2500
--

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-5-7ffd0a8836f9> in <module>()
      2 model = tflearn.DNN(net, tensorboard_verbose=0)   #train using tensorflow Deep nueral net
      3 model.fit(trainX, trainY, validation_set=(testX, testY), show_metric=True,    #fit launches training process for training and validation data, metric displays data as its training.
----> 4           batch_size=32)

~\Anaconda33\envs\TensorFlow1.8CPU\lib\site-packages\tflearn\models\dnn.py in fit(self, X_inputs, Y_targets, n_epoch, validation_set, show_metric, batch_size, shuffle, snapshot_epoch, snapshot_step, excl_trainops, validation_batch_size, run_id, callbacks)
    214                          excl_trainops=excl_trainops,
    215                          run_id=run_id,
--> 216                          callbacks=callbacks)
    217 
    218     def fit_batch(self, X_inputs, Y_targets):

~\Anaconda33\envs\TensorFlow1.8CPU\lib\site-packages\tflearn\helpers\trainer.py in fit(self, feed_dicts, n_epoch, val_feed_dicts, show_metric, snapshot_step, snapshot_epoch, shuffle_all, dprep_dict, daug_dict, excl_trainops, run_id, callbacks)
    337                                                        (bool(self.best_checkpoint_path) | snapshot_epoch),
    338                                                        snapshot_step,
--> 339                                                        show_metric)
    340 
    341                             # Update training state

~\Anaconda33\envs\TensorFlow1.8CPU\lib\site-packages\tflearn\helpers\trainer.py in _train(self, training_step, snapshot_epoch, snapshot_step, show_metric)
    816         tflearn.is_training(True, session=self.session)
    817         _, train_summ_str = self.session.run([self.train, self.summ_op],
--> 818                                              feed_batch)
    819 
    820         # Retrieve loss value from summary string

~\Anaconda33\envs\TensorFlow1.8CPU\lib\site-packages\tensorflow\python\client\session.py in run(self, fetches, feed_dict, options, run_metadata)
    898     try:
    899       result = self._run(None, fetches, feed_dict, options_ptr,
--> 900                          run_metadata_ptr)
    901       if run_metadata:
    902         proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

~\Anaconda33\envs\TensorFlow1.8CPU\lib\site-packages\tensorflow\python\client\session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
   1109                              'which has shape %r' %
   1110                              (np_val.shape, subfeed_t.name,
-> 1111                               str(subfeed_t.get_shape())))
   1112           if not self.graph.is_feedable(subfeed_t):
   1113             raise ValueError('Tensor %s may not be fed.' % subfeed_t)

ValueError: Cannot feed value of shape (32, 2) for Tensor 'InputData/X:0', which has shape '(?, 100)'

3 Answers:

Answer 0 (score: 0)

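The error comes from trainX = to_categorical(testY, nb_classes=2), which overwrites the padded training sequences with one-hot encoded test labels. It needs to be changed to testY = to_categorical(testY, nb_classes=2).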
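
To see where the 2 comes from, here is a minimal pure-Python sketch. The one_hot and shape helpers and the sample data are hypothetical stand-ins for illustration only; one_hot merely mimics the effect of one-hot encoding labels and is not tflearn's to_categorical implementation.

```python
# Illustrative stand-in for one-hot label encoding (an assumption for this
# sketch, not tflearn's actual implementation).
def one_hot(labels, nb_classes):
    """Return one row per label, with 1.0 at the label's index."""
    return [[1.0 if i == label else 0.0 for i in range(nb_classes)]
            for label in labels]


def shape(rows):
    """Shape of a rectangular list of lists as (n_rows, n_columns)."""
    return (len(rows), len(rows[0]) if rows else 0)


# Hypothetical batch: 32 padded reviews of length 100 and 32 binary labels.
padded_reviews = [[0] * 100 for _ in range(32)]
labels = [i % 2 for i in range(32)]

# The bug: the features are replaced by one-hot encoded labels.
buggy_trainX = one_hot(labels, nb_classes=2)

print(shape(padded_reviews))  # (32, 100)
print(shape(buggy_trainX))    # (32, 2)
```

With nb_classes=2 every encoded row has two columns, so a batch of 32 rows has shape (32, 2), which is the shape reported in the error.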

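Also, setting the batch size to None means the network should expect a batch of any size. In your case you set the batch size to 32, so the input shape could also be set to [32, 100].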
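
As a rough sketch of what None means in the input shape, the check below compares a fed value's shape against a placeholder shape. The shape_matches helper is hypothetical and only a simplified model of the compatibility check TensorFlow performs when feeding a value; it is not TensorFlow code.

```python
def shape_matches(value_shape, placeholder_shape):
    """Return True if the value could be fed into the placeholder.

    A None entry in placeholder_shape accepts any size in that dimension;
    every other entry must match exactly.
    """
    if len(value_shape) != len(placeholder_shape):
        return False
    return all(expected is None or expected == actual
               for actual, expected in zip(value_shape, placeholder_shape))


print(shape_matches((32, 100), (None, 100)))  # True
print(shape_matches((64, 100), (None, 100)))  # True
print(shape_matches((32, 2), (None, 100)))    # False
print(shape_matches((20, 100), (32, 100)))    # False
```

The last case suggests why None is usually the safer choice: a fixed first dimension rejects any batch whose size differs from it, for example a smaller final batch.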

Answer 1 (score: 0)

You left the number of classes for trainX at 2, but your model expects 100.

Edit:

I just noticed that in this piece of code you set trainX from testY:

trainX = to_categorical(testY, nb_classes=2)

It should be:

trainX = to_categorical(trainX, nb_classes=100)

So you need to change your code to:

#Data sequence padding
trainX = pad_sequences(trainX, maxlen=100, value=0.)  
testX = pad_sequences(testX, maxlen=100, value=0.)
#converting labels of each review to vectors
trainY = to_categorical(trainY, nb_classes=2)
#change the number of Classes
trainX = to_categorical(trainX, nb_classes=100) #CHANGE HERE!!

After making this change you should be fine. I just tested it and it works!

You can set the input shape with [None, 100]; this gives you more flexibility to change the batch size later if needed!

Answer 2 (score: 0)

tflearn.input_data([None, 100])

Here you are expecting the input to be a tensor with any number of instances, each having 100 features.

trainX = pad_sequences(trainX, maxlen=100, value=0.)  
testX = pad_sequences(testX, maxlen=100, value=0.)
#converting labels of each review to vectors
trainY = to_categorical(trainY, nb_classes=2)
trainX = to_categorical(testY, nb_classes=2) #HEREEEEEE

This is the problem in your code. You are resetting trainX to something with a different shape than the padded one. I think you meant:

testY = to_categorical(testY, nb_classes=2)

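If this still doesn't work, I suspect you are missing a reshape of your data. You are indeed padding, but over the whole of trainX, trainY, etc. Try padding each "row" separately. Then each instance will have the length of 100 that you expect.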

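Before that, print the shapes of your tensors (e.g. print(trainX.shape)) to check whether your data is really being preprocessed as you intend. (I would also suggest writing two scripts: one that does all the loading, preprocessing, reshaping and padding, and another with the TensorFlow logic.)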
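
The row-by-row padding and the shape check suggested above could be sketched roughly as follows. The pad_row and check_rows helpers and the sample reviews are hypothetical, written in pure Python for illustration; pad_row pads and truncates at the end of each sequence, which is an assumption of this sketch rather than a statement about tflearn's pad_sequences.

```python
def pad_row(row, maxlen=100, value=0):
    """Truncate or right-pad one sequence so it has exactly maxlen items."""
    row = list(row)[:maxlen]
    return row + [value] * (maxlen - len(row))


def check_rows(rows, expected_len=100):
    """Return (n_rows, expected_len), or raise if any row has another length."""
    for i, row in enumerate(rows):
        if len(row) != expected_len:
            raise ValueError("row %d has length %d, expected %d"
                             % (i, len(row), expected_len))
    return (len(rows), expected_len)


# Hypothetical reviews of different lengths (word indices).
raw_reviews = [[5, 8, 13], list(range(250)), []]

padded = [pad_row(review) for review in raw_reviews]
print(check_rows(padded))  # (3, 100)
```

Running a check like this right before model.fit would flag a feature array whose rows have length 2 instead of 100 before the data ever reaches the network.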