Question

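I am trying to build a recurrent neural network with the Keras functional API, but I have run into a problem with the output shape. Any help would be appreciated.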

My code:

import tensorflow as tf
from tensorflow.python.keras.datasets import mnist
from tensorflow.python.keras.layers import Dense, CuDNNLSTM, Dropout
from tensorflow.python.keras.models import Sequential
from tensorflow.python.keras.utils import normalize
from tensorflow.python.keras.utils import np_utils

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = normalize(x_train, axis=1), normalize(x_test, axis=1)

y_train = np_utils.to_categorical(y_train, 10)
y_test = np_utils.to_categorical(y_test, 10)

feature_input = tf.keras.layers.Input(shape=(28, 28))
x = tf.keras.layers.CuDNNLSTM(128, kernel_regularizer=tf.keras.regularizers.l2(l=0.0004), return_sequences=True)(feature_input)
y = tf.keras.layers.Dense(10, activation='softmax')(x)
model = tf.keras.Model(inputs=feature_input, outputs=y)
opt = tf.keras.optimizers.Adam(lr=1e-3, decay=1e-5)
model.compile(optimizer=opt, loss="sparse_categorical_crossentropy", metrics=['accuracy'])
model.fit(x_train, y_train, epochs=3, validation_data=(x_test, y_test))

Error:

ValueError: Error when checking target: expected dense to have 3 dimensions, but got array with shape (60000, 10)

Answer 1

Your data (the targets) has shape (60000, 10).

Your model's output (the "dense" layer) has shape (None, length, 10).

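Here None is the batch size (variable), length is the middle dimension, which holds the LSTM's "time steps", and 10 is the number of units in the Dense layer.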
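To make the shapes concrete, here is a minimal sketch that builds the same kind of model twice and compares the output shapes. It assumes a TensorFlow 2.x install and uses tf.keras.layers.LSTM in place of CuDNNLSTM so that it does not need a GPU. The helper name output_shape is made up for this illustration.

```python
import tensorflow as tf


def output_shape(return_sequences):
    """Build a small LSTM -> Dense model and return its output shape."""
    inputs = tf.keras.layers.Input(shape=(28, 28))
    # LSTM stands in for CuDNNLSTM here (assumption: no GPU available).
    x = tf.keras.layers.LSTM(128, return_sequences=return_sequences)(inputs)
    outputs = tf.keras.layers.Dense(10, activation="softmax")(x)
    return tuple(tf.keras.Model(inputs, outputs).output_shape)


# One prediction per time step: 3 dimensions (batch, length, units).
seq_shape = output_shape(return_sequences=True)
# One prediction per sample: 2 dimensions (batch, units).
last_shape = output_shape(return_sequences=False)

print(seq_shape)
print(last_shape)
```

Only the second shape has the same number of dimensions as targets of shape (60000, 10).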

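As it stands, your data contains no real time-step sequence for an LSTM to process, so the setup does not make much sense. The model interprets the image rows as consecutive time steps and the image columns as independent features. (If that was not your intention, you were lucky that feeding an image into an LSTM did not raise an error.)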
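The rows-as-time-steps reading can be illustrated with plain NumPy. This is only a sketch of how a (batch, 28, 28) array lines up with the (batch, time steps, features) layout a recurrent layer expects. The array contents are placeholder values, not MNIST pixels.

```python
import numpy as np

# A stand-in 28x28 "image" filled with placeholder values.
image = np.arange(28 * 28, dtype="float32").reshape(28, 28)

# Add a batch axis: (batch, time steps, features) = (1, 28, 28).
batch = image[np.newaxis]

# At time step t the recurrent layer sees row t: a vector of 28 features.
for t in range(3):
    step = batch[:, t, :]
    print(t, step.shape)

first_step = batch[:, 0, :]
```

Each step hands the layer one full image row, so the 28 columns play the role of the feature vector.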

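In any case, you can fix this error by setting return_sequences=False, which discards the length dimension of the sequence and returns only the last time step. That does not mean this model is the best choice for this problem.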
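A possible version of the corrected model is sketched below, under a few assumptions that are not in the question. It uses tf.keras.layers.LSTM instead of CuDNNLSTM so it runs without a GPU. It trains on random placeholder data with MNIST's shapes rather than on the real dataset. It also leaves out the optimizer's decay argument, because support for it differs across Keras versions. Separately from the shape error, one-hot targets of shape (60000, 10) normally pair with categorical_crossentropy, while sparse_categorical_crossentropy expects integer class labels, so the sketch uses the former.

```python
import numpy as np
import tensorflow as tf


def build_model():
    inputs = tf.keras.layers.Input(shape=(28, 28))
    # return_sequences=False keeps only the last time step: (None, 128).
    x = tf.keras.layers.LSTM(
        128,
        kernel_regularizer=tf.keras.regularizers.l2(0.0004),
        return_sequences=False,
    )(inputs)
    outputs = tf.keras.layers.Dense(10, activation="softmax")(x)
    model = tf.keras.Model(inputs=inputs, outputs=outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
        # One-hot targets go with categorical_crossentropy.
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


# Placeholder data with MNIST's shapes (assumption: stands in for x_train, y_train).
rng = np.random.default_rng(0)
x_demo = rng.random((64, 28, 28)).astype("float32")
y_demo = tf.keras.utils.to_categorical(rng.integers(0, 10, size=64), 10)

model = build_model()
model.fit(x_demo, y_demo, epochs=1, batch_size=32, verbose=0)
preds = model.predict(x_demo, verbose=0)
print(model.output_shape, preds.shape)
```

With the real data, x_demo and y_demo would be replaced by the normalized x_train and the one-hot y_train from the question.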

Keras: MNIST classification with an RNN model, question about the output shape
