I want to train an LSTM autoencoder that maps an input x (with shape [batch_size, timesteps, features]) to an output with exactly the same shape. I am not sure which loss function I should use for this problem.
I have tried loss functions based on Euclidean distance and MSELoss, but I am not sure how the data should be organized for the loss. For example, if I pick Euclidean distance as the loss function, how should the data be handled? Should I compute the Euclidean distance for each sequence/sample in the batch and then average the loss over the whole batch? Or should I first sum (or average) the loss over every time step within a sequence/sample and then average over the batch? (A sketch of these options follows below.)
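To make the options concrete, here is a minimal sketch of the reduction strategies I am considering. The tensors x and x_hat are placeholders for the input and the reconstruction, and the sizes are arbitrary:

import torch
import torch.nn.functional as F

batch_size, timesteps, features = 8, 20, 64
x = torch.randn(batch_size, timesteps, features)      # original input
x_hat = torch.randn(batch_size, timesteps, features)  # reconstruction

# Option 1: Euclidean (L2) distance per sequence, then mean over the batch.
# Flatten each sample to a vector and take the L2 norm of the residual.
per_sample_l2 = torch.norm((x - x_hat).reshape(batch_size, -1), dim=1)
loss_l2 = per_sample_l2.mean()

# Option 2: MSE averaged over every element (batch, time steps, features).
# This is what nn.MSELoss() with the default reduction='mean' computes.
loss_mse = F.mse_loss(x_hat, x)

# Option 3: sum over time steps and features, then mean over the batch.
loss_sum_then_mean = F.mse_loss(x_hat, x, reduction='none').sum(dim=(1, 2)).mean()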
import torch
import torch.nn as nn

class LSTM_FC(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers):
        super(LSTM_FC, self).__init__()
        self.in_size = input_size
        self.hidden_size = hidden_size
        self.nb_lstm_layers = num_layers
        self.dropout = nn.Dropout(0.5)
        self.relu = nn.ReLU()
        self.lstm = nn.LSTM(input_size=self.in_size,
                            hidden_size=self.hidden_size,
                            num_layers=self.nb_lstm_layers,
                            batch_first=True)
        self.fc = nn.Linear(self.hidden_size, self.in_size)

    def forward(self, x):
        # x: [batch_size, timesteps, input_size]
        out, h_state = self.lstm(x)  # out: [batch_size, timesteps, hidden_size]
        out = self.dropout(out)
        # nn.Linear is applied to the last dimension, so there is no need
        # to loop over the batch and apply the fully connected layer frame by frame.
        output_fc = self.fc(out)           # [batch_size, timesteps, input_size]
        output_fc = self.relu(output_fc)   # note: this clamps reconstructions to be non-negative
        return output_fc  # The output has the same shape as the input
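A quick sanity check that the reconstruction really has the same shape as the input (the hyperparameter values here are arbitrary placeholders):

model = LSTM_FC(input_size=64, hidden_size=128, num_layers=2)
x = torch.randn(8, 20, 64)     # [batch_size, timesteps, features]
x_hat = model(x)
assert x_hat.shape == x.shape  # torch.Size([8, 20, 64])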
import torch.optim as optim
from tqdm import tqdm, trange

model = LSTM_FC(input_size, hidden_size, num_layers).cuda()  # my hyperparameters
ae_optimizer = optim.Adam(model.parameters(), lr=0.0001, weight_decay=0.0001)
criterion = nn.MSELoss()

for epoch_i in trange(200, desc='Epoch', ncols=80):
    for batch_i, image_features in enumerate(tqdm(train_loader,
                                                  desc='Batch', ncols=80, leave=False)):
        ae_optimizer.zero_grad()
        image_features = image_features.cuda()  # Variable is deprecated; tensors work directly
        generated_features = model(image_features)  # call the instance, not the class
        # Reconstruction target is the input itself
        reconstruction_loss = criterion(generated_features.float(), image_features.float())
        reconstruction_loss.backward()  # Backward pass
        torch.nn.utils.clip_grad_norm_(model.parameters(), clip)  # clip = max grad norm from my config
        ae_optimizer.step()  # Update parameters
The loss does decrease, but the training and validation losses do not behave the same way. I mean, the loss gets really small on the training set, but on the validation set it is about 4-5 times larger than the training loss. Is this problem caused by the loss function? If so, how should I prepare the data for the loss function? Please help. (My validation code is sketched below in case that is relevant.)
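For reference, this is roughly how I compute the validation loss (a minimal sketch; val_loader is a placeholder for my validation DataLoader), so the two numbers should be directly comparable:

model.eval()  # disables the Dropout(0.5), unlike training mode
val_loss = 0.0
with torch.no_grad():
    for image_features in val_loader:
        image_features = image_features.cuda()
        generated_features = model(image_features)
        # same criterion and same 'mean' reduction as in training
        val_loss += criterion(generated_features, image_features).item()
val_loss /= len(val_loader)
model.train()  # back to training mode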