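I am trying to build a bi-LSTM model in TensorFlow on Google Colab. During training the model fails because the shape of the last layer is incompatible with the labels. I would like to know whether there is a way to reshape x_train and y_train to fix this.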
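If I set the number of units in the output layer to 11 instead of 10, no error occurs and the model trains. However, I want the output to have 10 units, not 11. With 10 units I get: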
ValueError: Shapes (16, 11) and (16, 10) are incompatible
# current output layer (runs without error)
tf.keras.layers.Dense(11, activation='softmax')
# expected output layer (raises the shape incompatibility error)
tf.keras.layers.Dense(10, activation='softmax')
BATCH_SIZE is set to 16. The model is built as follows; the shapes of x_train and y_train are listed after it:
import tensorflow as tf

# NOTE: rnn_units is not defined in this snippet; it is assumed to be set elsewhere.
def build_model(vocab_size, embedding_dim=64, input_length=30):
    print('\nbuilding the model...\n')
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(input_dim=(vocab_size + 1), output_dim=embedding_dim, input_length=input_length),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(rnn_units, return_sequences=True, dropout=0.2)),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(rnn_units, return_sequences=True, dropout=0.2)),
        tf.keras.layers.GlobalMaxPool1D(),
        tf.keras.layers.Dropout(0.1),
        tf.keras.layers.Dense(64, activation='tanh'),
        # softmax output layer
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    # optimizer & loss
    opt = 'RMSprop'  # tf.optimizers.Adam(learning_rate=1e-4)
    loss = 'categorical_crossentropy'
    # metrics
    metrics = ['accuracy', 'AUC', 'Precision', 'Recall']
    # compile model
    model.compile(optimizer=opt,
                  loss=loss,
                  metrics=metrics)
    model.summary()
    return model
x_train.shape
(800, 30)
y_train.shape
(800,)
def train(model, x_train, y_train, x_validation, y_validation,
          epochs, batch_size=32, patience=5,
          verbose=2, monitor_es='accuracy', mode_es='auto', restore=True,
          monitor_mc='val_accuracy', mode_mc='max'):
    print('\ntraining...\n')
    # callbacks: early stopping and checkpointing of the best model
    early_stopping = tf.keras.callbacks.EarlyStopping(monitor=monitor_es,
                                                      verbose=1, mode=mode_es, restore_best_weights=restore,
                                                      min_delta=1e-3, patience=patience)
    model_checkpoint = tf.keras.callbacks.ModelCheckpoint('tfjsmode.h5', monitor=monitor_mc, mode=mode_mc,
                                                          verbose=1, save_best_only=True)
    # define TensorBoard as a Keras callback
    tensorboard = tf.keras.callbacks.TensorBoard(
        log_dir='./logs',
        histogram_freq=1,
        write_images=True
    )
    keras_callbacks = [tensorboard, early_stopping, model_checkpoint]
    # train model
    history = model.fit(x_train, y_train,
                        batch_size=batch_size, epochs=epochs, verbose=verbose,
                        validation_data=(x_validation, y_validation),
                        callbacks=keras_callbacks)
    return history
Answer 0 (score: 1)
It looks like the labels you are currently using are integers (i.e., not one-hot encoded vectors). For example, your y looks like
[0, 1, 8, 9, ....] # a vector of 800 elements
There are two ways to train a model on data like this.
Use sparse_categorical_crossentropy as the model's loss function:
model.compile(optimizer=opt, loss='sparse_categorical_crossentropy', metrics=metrics)
Convert the labels to one-hot encoding with
y_onehot = tf.keras.utils.to_categorical(y, num_classes=10)
and then keep the model's loss as categorical_crossentropy.
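As a side note on where an 11-column target could come from: the question does not show the label values, so this is only a guess. If the labels run from 1 to 10 rather than 0 to 9, a one-hot conversion that infers the width from the largest label produces 11 columns, which would match the (16, 11) in the error. The sketch below uses a small NumPy helper, one_hot, which is a hypothetical stand-in for to_categorical written for illustration only, and the label values are made up.

```python
import numpy as np

def one_hot(labels, num_classes=None):
    """NumPy stand-in for one-hot encoding (illustration only, not the Keras function)."""
    labels = np.asarray(labels, dtype=int)
    if num_classes is None:
        # Infer the width from the largest label, i.e. max label + 1.
        num_classes = labels.max() + 1
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype="float32")[labels]

# Hypothetical labels running 1..10 instead of 0..9 (an assumption).
y = np.array([1, 2, 9, 10])

print(one_hot(y).shape)                      # (4, 11): column 0 is never used
print(one_hot(y - 1, num_classes=10).shape)  # (4, 10): labels shifted to 0..9
```

If that is what is happening in your data, shifting the labels to start at 0 before encoding (or before using the sparse loss) should let Dense(10) work. If your labels really do take 11 distinct values, then 11 output units is the correct size.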