I want to train an LSTM autoencoder that maps an input x (with shape [batch_size, timesteps, features]) to an output with exactly the same shape. I am not sure which loss function I should use for this problem.
I have tried loss functions based on Euclidean distance and MSELoss, but I am not sure how the data should be organized for the loss. For example, if I pick Euclidean distance as the loss function, how should the data be handled? Should I compute the Euclidean distance for each sequence/sample in the batch and then average the loss over the whole batch? Or should I first sum (or average) the loss over every time step within a sequence/sample and then average over the batch? (A sketch of these options follows below.)
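To make the options concrete, here is a minimal sketch of the reduction strategies I am considering. The tensors x and x_hat are placeholders for the input and the reconstruction, and the sizes are arbitrary:

import torch
import torch.nn.functional as F

batch_size, timesteps, features = 8, 20, 64
x = torch.randn(batch_size, timesteps, features)      # original input
x_hat = torch.randn(batch_size, timesteps, features)  # reconstruction

# Option 1: Euclidean (L2) distance per sequence, then mean over the batch.
# Flatten each sample to a vector and take the L2 norm of the residual.
per_sample_l2 = torch.norm((x - x_hat).reshape(batch_size, -1), dim=1)
loss_l2 = per_sample_l2.mean()

# Option 2: MSE averaged over every element (batch, time steps, features).
# This is what nn.MSELoss() with the default reduction='mean' computes.
loss_mse = F.mse_loss(x_hat, x)

# Option 3: sum over time steps and features, then mean over the batch.
loss_sum_then_mean = F.mse_loss(x_hat, x, reduction='none').sum(dim=(1, 2)).mean()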
import torch
import torch.nn as nn

class LSTM_FC(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers):
        super(LSTM_FC, self).__init__()
        self.in_size = input_size
        self.hidden_size = hidden_size
        self.nb_lstm_layers = num_layers
        self.dropout = nn.Dropout(0.5)
        self.relu = nn.ReLU()
        self.lstm = nn.LSTM(input_size=self.in_size,
                            hidden_size=self.hidden_size,
                            num_layers=self.nb_lstm_layers,
                            batch_first=True)
        self.fc = nn.Linear(self.hidden_size, self.in_size)

    def forward(self, x):
        # x: [batch_size, timesteps, input_size]
        out, h_state = self.lstm(x)  # out: [batch_size, timesteps, hidden_size]
        out = self.dropout(out)
        # nn.Linear is applied to the last dimension, so there is no need
        # to loop over the batch and apply the fully connected layer frame by frame.
        output_fc = self.fc(out)           # [batch_size, timesteps, input_size]
        output_fc = self.relu(output_fc)   # note: this clamps reconstructions to be non-negative
        return output_fc  # The output has the same shape as the input
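A quick sanity check that the reconstruction really has the same shape as the input (the hyperparameter values here are arbitrary placeholders):

model = LSTM_FC(input_size=64, hidden_size=128, num_layers=2)
x = torch.randn(8, 20, 64)     # [batch_size, timesteps, features]
x_hat = model(x)
assert x_hat.shape == x.shape  # torch.Size([8, 20, 64])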
import torch.optim as optim
from tqdm import tqdm, trange

model = LSTM_FC(input_size, hidden_size, num_layers).cuda()  # my hyperparameters
ae_optimizer = optim.Adam(model.parameters(), lr=0.0001, weight_decay=0.0001)
criterion = nn.MSELoss()

for epoch_i in trange(200, desc='Epoch', ncols=80):
    for batch_i, image_features in enumerate(tqdm(train_loader,
                                                  desc='Batch', ncols=80, leave=False)):
        ae_optimizer.zero_grad()
        image_features = image_features.cuda()  # Variable is deprecated; tensors work directly
        generated_features = model(image_features)  # call the instance, not the class
        # Reconstruction target is the input itself
        reconstruction_loss = criterion(generated_features.float(), image_features.float())
        reconstruction_loss.backward()  # Backward pass
        torch.nn.utils.clip_grad_norm_(model.parameters(), clip)  # clip = max grad norm from my config
        ae_optimizer.step()  # Update parameters
The loss does decrease, but the training and validation losses do not behave the same way. I mean, the loss gets really small on the training set, but on the validation set it is about 4-5 times larger than the training loss. Is this problem caused by the loss function? If so, how should I prepare the data for the loss function? Please help. (My validation code is sketched below in case that is relevant.)
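For reference, this is roughly how I compute the validation loss (a minimal sketch; val_loader is a placeholder for my validation DataLoader), so the two numbers should be directly comparable:

model.eval()  # disables the Dropout(0.5), unlike training mode
val_loss = 0.0
with torch.no_grad():
    for image_features in val_loader:
        image_features = image_features.cuda()
        generated_features = model(image_features)
        # same criterion and same 'mean' reduction as in training
        val_loss += criterion(generated_features, image_features).item()
val_loss /= len(val_loader)
model.train()  # back to training mode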