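I know there are several options when an LSTM receives variable-length sequences as input: (1) padding, (2) batching together inputs of the same length, and (3) feeding observations one at a time (batch size = 1). My question concerns options 2 and 3. I am not interested in padding, and I already know how to do it.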
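To make options 2 and 3 concrete, here is a minimal NumPy-only sketch of grouping sequences by length. The helper name `batches_by_length` and the toy data are hypothetical, introduced only for illustration; they are not part of my actual pipeline.

```python
from collections import defaultdict

import numpy as np


def batches_by_length(sequences, batch_size):
    """Yield arrays of shape (n, length, features) with n <= batch_size.

    Sequences are first bucketed by length, so every batch contains
    sequences of one length only and no padding is needed.
    """
    buckets = defaultdict(list)
    for seq in sequences:
        buckets[len(seq)].append(seq)
    for length in sorted(buckets):
        group = buckets[length]
        for start in range(0, len(group), batch_size):
            yield np.stack(group[start:start + batch_size])


rng = np.random.default_rng(0)
# Toy data: three sequences of length 4, two of length 7, one of length 9,
# each with 5 features per timestep.
lengths = [4, 7, 4, 9, 7, 4]
sequences = [rng.random((n, 5)) for n in lengths]

# Option 2: batches of equal-length sequences.
same_length_batches = list(batches_by_length(sequences, batch_size=2))
batch_shapes = [b.shape for b in same_length_batches]

# Option 3: every sequence becomes its own batch of size 1.
single_batches = [seq[np.newaxis, ...] for seq in sequences]
```

With option 2 the last batch of a bucket can be smaller than `batch_size`; option 3 is simply the special case where every batch has a leading dimension of 1.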
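This post creates a generator that produces batches containing the same number of observations, all with the same sequence length within a batch. It yields X_train and y_train together as a tuple. For this minimal example, assume we are doing binary classification, with one label per timestep (sequence labeling).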
#Load libraries
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, metrics
import numpy as np
from tensorflow.keras.utils import to_categorical
def train_generator():
    while True:
        # Each batch gets its own random sequence length
        sequence_length = np.random.randint(10, 100)
        x_train = np.random.random((1000, sequence_length, 5))
        # y_train will depend on past 5 timesteps of x
        # (copy, so the in-place += below does not overwrite x_train[:, :, 0])
        y_train = x_train[:, :, 0].copy()
        for i in range(1, 5):
            y_train[:, i:] += x_train[:, :-i, i]
        y_train = to_categorical(y_train > 2.5)
        yield (x_train, y_train)
training_iterator = train_generator()
# First batch
first_batch = next(training_iterator)
print('The shape of X_train in the first batch:{}'.format(first_batch[0].shape))
# e.g. The shape of X_train in the first batch:(1000, 26, 5)
print('The shape of y_train in the first batch:{}'.format(first_batch[1].shape))
# e.g. The shape of y_train in the first batch:(1000, 26, 2)
# Second batch (the sequence length differs from batch to batch)
second_batch = next(training_iterator)
print('The shape of X_train in the second batch:{}'.format(second_batch[0].shape))
# e.g. The shape of X_train in the second batch:(1000, 94, 5)
print('The shape of y_train in the second batch:{}'.format(second_batch[1].shape))
# e.g. The shape of y_train in the second batch:(1000, 94, 2)
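The label rule used in the generator above can be spelled out explicitly: the score at timestep t is x[t, 0] + x[t-1, 1] + ... + x[t-4, 4] (skipping terms that would fall before the start of the sequence), and the label is 1 when that score exceeds 2.5. The sketch below is only a NumPy illustration of that rule; the function names `make_labels` and `make_labels_loop` are made up for this example, and the one-hot encoding uses `np.eye` in place of `to_categorical` so that it runs without TensorFlow.

```python
import numpy as np


def make_labels(x):
    """Vectorized rule, computed on a copy so that x is left untouched."""
    y = x[:, :, 0].copy()
    for i in range(1, 5):
        # Feature i contributes with a lag of i timesteps.
        y[:, i:] += x[:, :-i, i]
    return y > 2.5


def make_labels_loop(x):
    """The same rule written out timestep by timestep."""
    n_samples, n_steps, _ = x.shape
    y = np.zeros((n_samples, n_steps))
    for t in range(n_steps):
        for i in range(5):
            if t - i >= 0:
                y[:, t] += x[:, t - i, i]
    return y > 2.5


rng = np.random.default_rng(0)
x = rng.random((3, 12, 5))            # 3 sequences, 12 timesteps, 5 features
labels = make_labels(x)               # boolean array of shape (3, 12)
one_hot = np.eye(2)[labels.astype(int)]  # shape (3, 12, 2)
```

Both functions produce the same boolean labels, which is a quick way to check that the slicing in the vectorized version does what the comment in the generator says.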
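For the modeling part, I am using TensorFlow 2.0, with two LSTM layers and a dense output layer wrapped in TimeDistributed. Because my sequence lengths vary, I have to use None as the timestep dimension of the input shape.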
inputs = keras.Input(shape=(None, 5), name='inputs')
#<tf.Tensor 'inputs:0' shape=(None, None, 5) dtype=float32>
whole_sequence_output = keras.layers.LSTM(32, return_sequences=True, return_state=False)(inputs)
#<tf.Tensor 'lstm_1/Identity:0' shape=(None, None, 32) dtype=float32>
whole_sequence_output2 = keras.layers.LSTM(8, return_sequences=True, return_state=False)(whole_sequence_output)
#<tf.Tensor 'lstm_4/Identity:0' shape=(None, None, 8) dtype=float32>
outputs = layers.TimeDistributed(layers.Dense(2, activation='sigmoid', name='predictions'))(whole_sequence_output2)
#<tf.Tensor 'time_distributed_1/Identity:0' shape=(None, None, 2) dtype=float32>
model = keras.Model(inputs=inputs, outputs=outputs)
model.summary()
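As a quick check of what the None timestep dimension buys, the sketch below rebuilds the same architecture under a separate name (`shape_check_model`, chosen only for this example) and runs two batches with different sequence lengths through it. The batch size of 4 and the lengths 26 and 94 are arbitrary.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Same layer stack as in the question; the timestep dimension is left as None.
inputs = keras.Input(shape=(None, 5), name='inputs')
x = layers.LSTM(32, return_sequences=True)(inputs)
x = layers.LSTM(8, return_sequences=True)(x)
outputs = layers.TimeDistributed(layers.Dense(2, activation='sigmoid'))(x)
shape_check_model = keras.Model(inputs=inputs, outputs=outputs)

# Two batches whose sequence lengths differ.
short_batch = np.random.random((4, 26, 5)).astype('float32')
long_batch = np.random.random((4, 94, 5)).astype('float32')

short_out = shape_check_model.predict(short_batch, verbose=0)
long_out = shape_check_model.predict(long_batch, verbose=0)
print(short_out.shape, long_out.shape)
```

The output keeps the timestep dimension of whatever batch goes in, so one model instance can handle batches of any length as long as all sequences inside a batch share that length.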
I use the Adam optimizer with a binary cross-entropy loss function.
model.compile(optimizer=keras.optimizers.Adam(),  # Optimizer
              # Loss function to minimize
              loss=keras.losses.BinaryCrossentropy(),
              # List of metrics to monitor
              metrics=['accuracy'])
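In TensorFlow 2.0, model.fit() supports generators. For simplicity, I run the model for 10 epochs, each consisting of 32 forward/backward passes that update the model parameters: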
history = model.fit(train_generator(), steps_per_epoch=32, epochs=10, verbose=1)
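The model appears to train.
My problem is that I cannot get validation to work. How do I write the code that validates the model at EACH epoch, so that I can evaluate the trained model as training progresses? I have tried several approaches, without success.
I could not find much about this process online.
Does my validation set have to be the same size as the training set in each epoch? Is there a code snippet or example somewhere?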
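For concreteness, this is roughly the kind of call I have in mind: a second generator passed as validation_data, with validation_steps saying how many validation batches to draw per epoch. This is only a sketch under assumptions. It relies on validation_data accepting a Python generator, which is documented for more recent TensorFlow/Keras releases and which I have not confirmed for 2.0. The helper `make_generator`, the fixed length cycles, the batch sizes, and the step counts are all placeholders made up for this example.

```python
import itertools

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers


def make_generator(lengths, batch_size, seed):
    """Endlessly yield (x, y) batches, cycling through the given lengths."""
    rng = np.random.default_rng(seed)
    for seq_len in itertools.cycle(lengths):
        x = rng.random((batch_size, seq_len, 5)).astype('float32')
        # Same label rule as in the question, computed on a copy of x[:, :, 0].
        y = x[:, :, 0].copy()
        for i in range(1, 5):
            y[:, i:] += x[:, :-i, i]
        labels = (y > 2.5).astype(int)
        yield x, np.eye(2, dtype='float32')[labels]  # one-hot, (batch, time, 2)


inputs = keras.Input(shape=(None, 5), name='inputs')
x = layers.LSTM(32, return_sequences=True)(inputs)
x = layers.LSTM(8, return_sequences=True)(x)
outputs = layers.TimeDistributed(layers.Dense(2, activation='sigmoid'))(x)
model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(optimizer=keras.optimizers.Adam(),
              loss=keras.losses.BinaryCrossentropy(),
              metrics=['accuracy'])

history = model.fit(
    make_generator([12, 20, 16], batch_size=64, seed=0),  # training batches
    steps_per_epoch=4,
    # Assumption: a generator is accepted here; validation runs after each epoch.
    validation_data=make_generator([15, 25], batch_size=32, seed=1),
    validation_steps=2,  # validation batches drawn per epoch
    epochs=2,
    verbose=0,
)
val_losses = history.history['val_loss']  # one entry per epoch
```

In this sketch the validation batches are smaller and fewer than the training batches, and their sequence lengths differ from the training ones; validation_steps, not the training set size, decides how much validation data is consumed per epoch.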