Question

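I'm new to PyTorch. I followed a tutorial on sentence generation with an RNN and am trying to modify it to generate sequences of positions, but I'm having trouble defining the correct model parameters, such as input_size, output_size, hidden_dim and batch_size.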

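Background: I have 596 sequences of x, y positions, each of which looks like [[x1, y1], [x2, y2], ..., [xn, yn]]. Each sequence represents the 2D path of a vehicle. I would like to train a model that, given a starting point (or a partial sequence), can generate one of these sequences.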

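- I have padded/truncated the sequences so that they all have length 50, meaning each sequence is an array of shape [50, 2].

- I then split this data into input_seq and target_seq:

input_seq: tensor of torch.Size([596, 49, 2]). Contains all 596 sequences, each without its last position.

target_seq: tensor of torch.Size([596, 49, 2]). Contains all 596 sequences, each without its first position.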
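For reference, a split like the one described above could be produced by slicing along the time dimension. This is only a minimal sketch: the tensor name `sequences` and its random contents are assumptions standing in for the real padded data.

```python
import torch

# Stand-in for the real data: 596 padded/truncated paths, 50 steps, (x, y) each.
# The name `sequences` and the random values are illustrative assumptions.
sequences = torch.randn(596, 50, 2)

# Input drops the last position of every sequence; target drops the first,
# so target_seq[:, t] is the position that follows input_seq[:, t].
input_seq = sequences[:, :-1, :]
target_seq = sequences[:, 1:, :]

print(input_seq.shape)   # torch.Size([596, 49, 2])
print(target_seq.shape)  # torch.Size([596, 49, 2])
```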

The model class:

import torch
from torch import nn

class Model(nn.Module):
    def __init__(self, input_size, output_size, hidden_dim, n_layers):
        super(Model, self).__init__()
        # Defining some parameters
        self.hidden_dim = hidden_dim
        self.n_layers = n_layers
        # Defining the layers
        # RNN Layer
        self.rnn = nn.RNN(input_size, hidden_dim, n_layers, batch_first=True)
        # Fully connected layer
        self.fc = nn.Linear(hidden_dim, output_size)

    def forward(self, x):
        batch_size = x.size(0)
        # Initializing hidden state for first input using method defined below
        hidden = self.init_hidden(batch_size)
        # Passing in the input and hidden state into the model and obtaining outputs
        out, hidden = self.rnn(x, hidden)
        # Reshaping the outputs such that it can be fit into the fully connected layer
        out = out.contiguous().view(-1, self.hidden_dim)
        out = self.fc(out)
        return out, hidden

    def init_hidden(self, batch_size):
        # This method generates the first hidden state of zeros which we'll use in the forward pass
        hidden = torch.zeros(self.n_layers, batch_size, self.hidden_dim)
        return hidden

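I instantiate the model with the following parameters:

input_size of 2 (an [x, y] position)

output_size of 2 (an [x, y] position)

hidden_dim of 2 (an [x, y] position) (or should this be 50, as in the length of a full sequence?)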

model = Model(input_size=2, output_size=2, hidden_dim=2, n_layers=1)
n_epochs = 100
lr=0.01
# Define Loss, Optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=lr)

# Training Run
for epoch in range(1, n_epochs + 1):
    optimizer.zero_grad() # Clears existing gradients from previous epoch
    output, hidden = model(input_seq)
    loss = criterion(output, target_seq.view(-1).long())
    loss.backward() # Does backpropagation and calculates gradients
    optimizer.step() # Updates the weights accordingly
    if epoch%10 == 0:
        print('Epoch: {}/{}.............'.format(epoch, n_epochs), end=' ')
        print("Loss: {:.4f}".format(loss.item()))

When I run the training loop, it fails with this error:

ValueError                                Traceback (most recent call last)
<ipython-input-9-ad1575e0914b> in <module>
      3     optimizer.zero_grad() # Clears existing gradients from previous epoch
      4     output, hidden = model(input_seq)
----> 5     loss = criterion(output, target_seq.view(-1).long())
      6     loss.backward() # Does backpropagation and calculates gradients
      7     optimizer.step() # Updates the weights accordingly
...

ValueError: Expected input batch_size (29204) to match target batch_size (58408).

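I've tried modifying input_size, output_size, hidden_dim and batch_size and reshaping the tensors, but the more I try, the more confused I get. Could someone point out what I am doing wrong?

Furthermore, since the batch size is defined as x.size(0) in Model.forward(self, x), this means I only have a single batch of size 596, right? What would be the correct way to have multiple smaller batches?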

Answer 1

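The output has size [batch_size * seq_len, 2] = [29204, 2], while the flattened target_seq has size [batch_size * seq_len * 2] = [58408]. They hold the same total number of elements but have a different number of dimensions, so their first dimensions do not match.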
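The two numbers in the error message can be reproduced with plain arithmetic. The sketch below only recomputes the shapes from the sizes given in the question; the variable names are illustrative.

```python
# Sizes taken from the question: 596 sequences, 49 time steps, 2 coordinates.
batch_size, seq_len, n_coords = 596, 49, 2

# Model output after .view(-1, hidden_dim) and the linear layer:
# one row per time step, one column per coordinate.
output_shape = (batch_size * seq_len, n_coords)

# target_seq.view(-1): every single coordinate becomes its own element.
target_shape = (batch_size * seq_len * n_coords,)

print(output_shape)  # (29204, 2)
print(target_shape)  # (58408,)
```

The loss function compares the first dimension of each, 29204 against 58408, which is exactly the mismatch reported in the ValueError.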

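Regardless of the dimension mismatch, nn.CrossEntropyLoss is a classification loss function, which means it expects the output to predict a class. You don't have any classes; you are trying to predict coordinates, which are continuous values. For that you need a regression loss function such as nn.MSELoss, which computes the mean squared error between the predicted and the target coordinates.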

criterion = nn.MSELoss()

# .flatten() does the same thing as .view(-1) but is more descriptive
loss = criterion(output.flatten(), target_seq.flatten())
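To make concrete what nn.MSELoss computes, here is a small sketch with made-up coordinates (the values are purely illustrative). With the default reduction, the loss is the mean of the squared element-wise differences.

```python
import torch
from torch import nn

# Two predicted (x, y) points and their targets (made-up values).
pred = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
target = torch.tensor([[1.0, 1.0], [5.0, 4.0]])

criterion = nn.MSELoss()
loss = criterion(pred, target)

# Same quantity computed by hand: mean of the squared differences.
# Differences are 0, 1, -2, 0, so the squares sum to 5 over 4 elements.
manual = ((pred - target) ** 2).mean()

print(loss.item())    # 1.25
print(manual.item())  # 1.25
```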

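Since both the loss function and the linear layer can operate on multi-dimensional inputs, you can avoid flattening altogether. This removes the risk of mixing up dimensions when flattening and restoring them, and the output is easier to inspect or to use later outside of training. For the linear layer, only the last dimension of the input needs to match the in_features of nn.Linear, which is hidden_dim in your case.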

def forward(self, x):
    batch_size = x.size(0)      
    # Initializing hidden state for first input using method defined below
    hidden = self.init_hidden(batch_size)
    # Passing in the input and hidden state into the model and obtaining outputs
    # out size: [batch_size, seq_len, hidden_dim]
    out, hidden = self.rnn(x, hidden)
    # out size: [batch_size, seq_len, output_size]
    out = self.fc(out)        
    return out, hidden

Now the output of the model has the same size as target_seq, and you can call the loss function directly without flattening:

loss = criterion(output, target_seq)
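The following sketch illustrates why the flattening step is unnecessary: nn.Linear applied to a 3D tensor gives the same values as flattening, applying the layer and reshaping back. The feature size of 32 and the random input are arbitrary assumptions for the demonstration.

```python
import torch
from torch import nn

hidden_dim = 32  # arbitrary value chosen for this illustration
fc = nn.Linear(in_features=hidden_dim, out_features=2)

# Stand-in for the RNN output: [batch_size, seq_len, hidden_dim]
rnn_out = torch.randn(596, 49, hidden_dim)

# Applied directly, the layer acts on the last dimension only.
out = fc(rnn_out)
print(out.shape)  # torch.Size([596, 49, 2])

# The flatten -> linear -> reshape route gives the same values.
flat = fc(rnn_out.view(-1, hidden_dim)).view(596, 49, 2)
same = torch.allclose(out, flat, atol=1e-6)
```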

> hidden_dim of 2 (an [x, y] position) (or should this be 50, as in the length of a full sequence?)

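hidden_dim is not an [x, y] pair and is entirely unrelated to input_size and output_size. It defines the number of hidden features of the RNN, which is a rough measure of its capacity: a larger size leaves more room to retain essential information, but also requires more computation. There is no perfect hidden size; it depends heavily on the use case. You can try different sizes, e.g. 100 or 256, and see whether that improves your results.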
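As a rough illustration of the extra computation, the sketch below counts the trainable parameters of a single-layer nn.RNN with input_size=2 for a few hidden sizes. The helper name `rnn_param_count` is made up for this example.

```python
from torch import nn

def rnn_param_count(hidden_dim, input_size=2, n_layers=1):
    """Number of trainable parameters in an nn.RNN (hypothetical helper)."""
    rnn = nn.RNN(input_size, hidden_dim, n_layers, batch_first=True)
    return sum(p.numel() for p in rnn.parameters())

# One layer holds an input-to-hidden matrix, a hidden-to-hidden matrix
# and two bias vectors: 2*h + h*h + h + h parameters for input_size=2.
for hidden_dim in (2, 100, 256):
    print(hidden_dim, rnn_param_count(hidden_dim))
# 2 12
# 100 10400
# 256 66560
```

The input and output sizes stay at 2 throughout; only the internal width of the network changes.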

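> Furthermore, since the batch size is defined as x.size(0) in Model.forward(self, x), this means I only have a single batch of size 596, right? What would be the correct way to have multiple smaller batches?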

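Yes, you only have a single batch of size 596. If you want to use smaller batches, for example because a more complex model cannot fit all the data in one batch, you could simply use slices of the tensors, but it is better to use PyTorch's data utilities: torch.utils.data.TensorDataset builds a dataset in which each input sequence is paired with its corresponding target, and torch.utils.data.DataLoader creates the batches for you.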

from torch.utils.data import DataLoader, TensorDataset

# Match each sequence of the input_seq to the corresponding target_seq.
# e.g. dataset[0] == (input_seq[0], target_seq[0])
dataset = TensorDataset(input_seq, target_seq)

# Randomly shuffle the data and load it in batches of 16
data_loader = DataLoader(dataset, batch_size=16, shuffle=True)

# Process one batch at a time
for input, target in data_loader:
    output, hidden = model(input)
    loss = criterion(output, target)
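Putting the pieces together, a complete mini-batch training loop might look like the sketch below. Everything specific in it is an assumption for illustration: the data is random, hidden_dim=32 and n_epochs=2 are arbitrary, and the model omits the explicit init_hidden call because nn.RNN starts from a zero hidden state when none is passed.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

class Model(nn.Module):
    def __init__(self, input_size, output_size, hidden_dim, n_layers):
        super().__init__()
        self.rnn = nn.RNN(input_size, hidden_dim, n_layers, batch_first=True)
        self.fc = nn.Linear(hidden_dim, output_size)

    def forward(self, x):
        out, hidden = self.rnn(x)   # hidden state defaults to zeros
        return self.fc(out), hidden  # out: [batch, seq_len, output_size]

torch.manual_seed(0)
input_seq = torch.randn(596, 49, 2)   # random stand-in data
target_seq = torch.randn(596, 49, 2)

model = Model(input_size=2, output_size=2, hidden_dim=32, n_layers=1)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
data_loader = DataLoader(TensorDataset(input_seq, target_seq),
                         batch_size=16, shuffle=True)

n_epochs = 2  # kept small for the illustration
for epoch in range(1, n_epochs + 1):
    epoch_loss = 0.0
    for inputs, targets in data_loader:
        optimizer.zero_grad()             # clear gradients from the last step
        output, _ = model(inputs)
        loss = criterion(output, targets)
        loss.backward()                   # backpropagate
        optimizer.step()                  # update the weights
        epoch_loss += loss.item() * inputs.size(0)
    mean_loss = epoch_loss / len(data_loader.dataset)
    print(f"Epoch {epoch}/{n_epochs}  mean loss: {mean_loss:.4f}")
```

With 596 sequences and batch_size=16, each epoch consists of 38 batches, the last of which holds only 4 sequences; the loss is weighted by batch size so that the smaller final batch does not skew the epoch average.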
