LSTM raises ValueError: Shapes (5, 2, 3) and (5, 3) are incompatible

Posted: 2021-06-21 02:23:44

Tags: python tensorflow machine-learning keras lstm

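I want to do multi-class classification on time-series data. The dataset I have needs a lot of preprocessing, so to learn how to implement the model I used the IRIS dataset (which is not really suited to an LSTM), because it has exactly the same structure as my time-series data (4 input features, 1 output feature, 120 samples). I implemented the code below, but fitting the model with a batch size of 5 raises an invalid-shape error (I changed the batch size several times, but it made no difference).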

import pandas
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.utils import to_categorical

# Load dataset
dataframe = pandas.read_csv("iris.csv", header=None)
dataset = dataframe.values
X = dataset[:, 0:4].astype(float)
Y = dataset[:, 4]

# Encode the output variable: class names -> integers
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)
# Convert integers to dummy variables (one-hot encoded)
dummy_Y = to_categorical(encoded_Y)

# 20% is allocated for testing
X_train, X_test, y_train, y_test = train_test_split(X, dummy_Y, test_size=0.2)
X_train = X_train.reshape(60, 2, 4)
y_train = y_train.reshape(60, 2, 3)
y_train.shape, X_train.shape

> ((60, 2, 3), (60, 2, 4))


from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Create the neural network model
def create_nn_model():
  # Create sequential model
  model = Sequential()
  model.add(LSTM(100, dropout=0.2, input_shape=(X_train.shape[1], X_train.shape[2])))
  model.add(Dense(100, activation='relu'))
  model.add(Dense(3, activation='softmax'))
  # Compile model
  model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
  return model

model = create_nn_model()
model.summary()

> Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm_1 (LSTM)                (None, 100)               42000     
_________________________________________________________________
dense_2 (Dense)              (None, 100)               10100     
_________________________________________________________________
dense_3 (Dense)              (None, 3)                 303       
=================================================================
Total params: 52,403
Trainable params: 52,403
Non-trainable params: 0
model.fit(X_train,y_train,epochs=200,batch_size=5)

> ValueError                                Traceback (most recent call last)

<ipython-input-26-0aef33c299f0> in <module>()
----> 1 model.fit(X_train,y_train,epochs=200,batch_size=5) #X_train is independant variables. based on the amount of the data set data set will be trained by breaking into batches

9 frames

/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/func_graph.py in wrapper(*args, **kwargs)
    984           except Exception as e:  # pylint:disable=broad-except
    985             if hasattr(e, "ag_error_metadata"):
--> 986               raise e.ag_error_metadata.to_exception(e)
    987             else:
    988               raise

ValueError: in user code:

    /usr/local/lib/python3.7/dist-packages/keras/engine/training.py:830 train_function  *
        return step_function(self, iterator)
    /usr/local/lib/python3.7/dist-packages/keras/engine/training.py:813 run_step  *
        outputs = model.train_step(data)
    /usr/local/lib/python3.7/dist-packages/keras/engine/training.py:771 train_step  *
        loss = self.compiled_loss(
    /usr/local/lib/python3.7/dist-packages/keras/engine/compile_utils.py:201 __call__  *
        loss_value = loss_obj(y_t, y_p, sample_weight=sw)
    /usr/local/lib/python3.7/dist-packages/keras/losses.py:142 __call__  *
        losses = call_fn(y_true, y_pred)
    /usr/local/lib/python3.7/dist-packages/keras/losses.py:246 call  *
        return ag_fn(y_true, y_pred, **self._fn_kwargs)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py:206 wrapper  **
        return target(*args, **kwargs)
    /usr/local/lib/python3.7/dist-packages/keras/losses.py:1631 categorical_crossentropy
        y_true, y_pred, from_logits=from_logits)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py:206 wrapper
        return target(*args, **kwargs)
    /usr/local/lib/python3.7/dist-packages/keras/backend.py:4827 categorical_crossentropy
        target.shape.assert_is_compatible_with(output.shape)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/tensor_shape.py:1161 assert_is_compatible_with
        raise ValueError("Shapes %s and %s are incompatible" % (self, other))

    ValueError: Shapes (5, 2, 3) and (5, 3) are incompatible

1 answer:

Answer 0 (score: 1)

Your y_true and y_pred have different shapes. You may need to define your LSTM as follows:

model.add(LSTM(100,dropout=0.2, input_shape=(2,4), return_sequences=True))
....
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
....
dense_3 (Dense)              (None, 2, 3)              303        < ---
=================================================================
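
To make the fix above concrete, here is a minimal sketch of the question's model with return_sequences=True. It is an illustration under stated assumptions, not the asker's actual pipeline: the training arrays are random stand-ins with the question's shapes, and tf.keras.Input is used in place of the input_shape argument.

```python
import numpy as np
import tensorflow as tf

# Assumption: random stand-in data with the question's shapes, not real Iris rows.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(60, 2, 4)).astype("float32")
labels = rng.integers(0, 3, size=(60, 2))        # one class id per timestep
y_train = np.eye(3, dtype="float32")[labels]     # one-hot -> (60, 2, 3)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(2, 4)),
    # return_sequences=True keeps the timestep axis: (None, 2, 100)
    tf.keras.layers.LSTM(100, dropout=0.2, return_sequences=True),
    # Dense layers act on the last axis, so the timestep axis is preserved
    tf.keras.layers.Dense(100, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),
])
model.compile(loss="categorical_crossentropy", optimizer="adam",
              metrics=["accuracy"])

print(model.output_shape)   # now has the same rank as y_train

# One quick epoch, only to show that the shape check passes
model.fit(X_train, y_train, epochs=1, batch_size=5, verbose=0)
pred = model.predict(X_train[:5], verbose=0)
print(pred.shape)
```

With this layout the model predicts one class distribution per timestep, so it only makes sense if each timestep really carries its own label.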

Update

Using return_sequences=True works because you defined your training pairs this way:

X_train = X_train.reshape(60, 2, 4)
y_train = y_train.reshape(60, 2, 3)

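This stands for (batch_size, timestep, input_length). Note, though, that what has to be reshaped to satisfy the input requirement of the LSTM layer in the model above is X_train, not y_train. When you defined your model you did not use return_sequences, so the last layer is just a three-class classifier with no timestep axis, yet your y_train is defined with one. If you set return_sequences to True and print the model summary, you will see that the last layer's output shape is (None, 2, 3), which exactly matches the shape of y_train.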
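
To see where the timestep axis appears or disappears, the following is a simplified NumPy sketch of an LSTM forward pass. It is not Keras' implementation: the function name lstm_forward, the gate ordering, and the random weights are assumptions made for illustration only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_forward(x, W, U, b, return_sequences=False):
    """Hypothetical minimal LSTM.

    x: (batch, timesteps, features)
    W: (features, 4 * units), U: (units, 4 * units), b: (4 * units,)
    """
    batch, timesteps, _ = x.shape
    units = U.shape[0]
    h = np.zeros((batch, units))   # hidden state
    c = np.zeros((batch, units))   # cell state
    outputs = []
    for t in range(timesteps):
        # All four gates computed at once, then split
        z = x[:, t, :] @ W + h @ U + b
        i, f, g, o = np.split(z, 4, axis=1)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        outputs.append(h)
    if return_sequences:
        # One output per timestep: (batch, timesteps, units)
        return np.stack(outputs, axis=1)
    # Only the last timestep: (batch, units)
    return h

rng = np.random.default_rng(0)
units = 10
x = rng.normal(size=(5, 2, 4))               # (batch, timestep, input_length)
W = rng.normal(size=(4, 4 * units))
U = rng.normal(size=(units, 4 * units))
b = np.zeros(4 * units)

all_steps = lstm_forward(x, W, U, b, return_sequences=True)
last_step = lstm_forward(x, W, U, b, return_sequences=False)
print(all_steps.shape)   # (5, 2, 10)
print(last_step.shape)   # (5, 10)
```

The last slice of the full sequence, all_steps[:, -1], is the same array that return_sequences=False returns, which is why dropping the flag removes the timestep axis that y_train still has.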

Before looking at what return_sequences does here, you may want to understand what a timestep means in an LSTM model; see this answer. AFAIK, it depends on how many timesteps you need for your input; the LSTM cell can be unrolled once or many times (the n-th timestep). For the n-th timestep (n: {1, 2, 3, ..., N}), if I want the LSTM to return the output of every timestep (n of them), I set return_sequences = True, otherwise return_sequences = False. From the doc:

> return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence. Default: False.

In short, if it is set to True, the whole sequence is returned; if it is set to False, only the last output is returned. For example:

import tensorflow as tf

inputs = tf.random.normal([32, 8])
inputs = tf.reshape(inputs, [-1, 2, 4])  # or [-1, 4, 2] or [-1, 1, 8]
inputs.shape
# TensorShape([32, 2, 4])  -> (batch_size, timestep, input_length)

lstm = tf.keras.layers.LSTM(10, return_sequences=True)
whole_seq_output = lstm(inputs)
print(whole_seq_output.shape)
# (32, 2, 10)  -> (batch_size, timestep, output_length)

lstm = tf.keras.layers.LSTM(10, return_sequences=False)
last_seq_output = lstm(inputs)
print(last_seq_output.shape)
# (32, 10)  -> (batch_size, output_length)

Here is one approach to your code above. The Iris data is taken from here.

import pandas

dataframe = pandas.read_csv("/content/iris.csv")
dataframe.head(3)
#    sepal.length  sepal.width  petal.length  petal.width variety
# 0           5.1          3.5           1.4          0.2  Setosa
# 1           4.9          3.0           1.4          0.2  Setosa
# 2           4.7          3.2           1.3          0.2  Setosa

dataframe.variety.unique()
# array(['Setosa', 'Versicolor', 'Virginica'], dtype=object)

# Map each class name to an integer label
target_map = dict(zip(list(dataframe['variety'].unique()), [0, 1, 2]))
target_map
# {'Setosa': 0, 'Versicolor': 1, 'Virginica': 2}

dataframe['target'] = dataframe.variety.map(target_map)
dataframe.sample()
#      sepal.length  sepal.width  petal.length  petal.width    variety  target
# 128           6.4          2.8           5.6          2.1  Virginica       2

X = dataframe.iloc[:, :4]  # the four feature columns
Y = dataframe.iloc[:, 5]   # the integer 'target' column

X.shape, Y.shape
# ((150, 4), (150,))

Model

from tensorflow.keras.utils import to_categorical
from sklearn.model_selection import train_test_split

# One-hot encode the integer labels
OHE_Y = to_categorical(Y, num_classes=3)
X_train, X_test, y_train, y_test = train_test_split(X, OHE_Y,
                                                    test_size=0.2)

X_train.shape
# (120, 4)

# Make the input LSTM-compatible: one timestep per sample
X_train = X_train.values.reshape(-1, 1, 4)

X_train.shape, y_train.shape
# ((120, 1, 4), (120, 3))
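
As a rough illustration of the two preprocessing steps above, this NumPy-only sketch mimics one-hot encoding and compares the two reshapes that appear in this thread. The helper one_hot and the toy features array are hypothetical; they stand in for to_categorical and the Iris rows.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Hypothetical stand-in for to_categorical: row i of the identity
    matrix is the one-hot vector for class i."""
    labels = np.asarray(labels, dtype=int)
    return np.eye(num_classes, dtype="float32")[labels]

y = one_hot([0, 2, 1, 2], num_classes=3)
print(y.shape)   # (4, 3)

# Toy feature matrix: 6 samples, 4 features each
features = np.arange(24, dtype="float32").reshape(6, 4)

# The answer's layout: every sample is its own one-step sequence
single_step = features.reshape(-1, 1, 4)
print(single_step.shape)   # (6, 1, 4)

# The question's layout: consecutive rows are paired into two-step
# sequences, which halves the number of sequences
pairs = features.reshape(-1, 2, 4)
print(pairs.shape)   # (3, 2, 4)
```

With the one-step layout each sequence keeps exactly one label, so y_train can stay two-dimensional and the model does not need return_sequences=True.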

Inference

from tensorflow.keras import Sequential 
from tensorflow.keras.layers import LSTM, Dense 

def create_nn_model():
  model = Sequential()
  model.add(LSTM(100, dropout=0.2, input_shape=(X_train.shape[1],
                                               X_train.shape[2])))
  model.add(Dense(100, activation='relu'))
  model.add(Dense(3,activation='softmax'))
  model.compile(loss='categorical_crossentropy',
                optimizer='adam', metrics=['accuracy'])
  return model

model = create_nn_model()
model.summary()

model.fit(X_train, y_train, epochs=10,batch_size=5)

...
Epoch 9/10
3ms/step - loss: 0.5224 - accuracy: 0.7243
Epoch 10/10
3ms/step - loss: 0.5568 - accuracy: 0.7833
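
Once the model is trained, its softmax output still has to be turned back into class names. The sketch below shows one way to do that with NumPy. The probs and y_true arrays are made-up values standing in for what model.predict and y_test would provide; on real data X_test would first need the same reshape(-1, 1, 4) as X_train.

```python
import numpy as np

# Same mapping as in the answer, inverted for decoding
target_map = {'Setosa': 0, 'Versicolor': 1, 'Virginica': 2}
index_to_name = {index: name for name, index in target_map.items()}

# Assumption: hypothetical softmax outputs for three samples
probs = np.array([
    [0.80, 0.15, 0.05],
    [0.10, 0.20, 0.70],
    [0.20, 0.50, 0.30],
])
# Assumption: hypothetical one-hot ground truth for the same samples
y_true = np.array([
    [1, 0, 0],
    [0, 0, 1],
    [0, 0, 1],
])

# The most probable class per sample
pred_ids = probs.argmax(axis=-1)
true_ids = y_true.argmax(axis=-1)

pred_names = [index_to_name[int(i)] for i in pred_ids]
print(pred_names)   # ['Setosa', 'Virginica', 'Versicolor']

accuracy = float((pred_ids == true_ids).mean())
print(accuracy)
```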