I've been trying to use an LSTM in PyTorch (in a custom model with a Linear layer after the LSTM), but I get the following error when computing the loss:
Assertion cur_target >= 0 && cur_target < n_classes' failed.
I defined the loss function with:
criterion = nn.CrossEntropyLoss()
and then call it with
loss += criterion(output, target)
I pass targets of size [sequence_length, number_of_classes] and an output of size [sequence_length, 1, number_of_classes].
The examples I was following seemed to be doing the same thing, but it's different in the Pytorch docs on cross entropy loss. The docs say the target should have dimension (N), where each value satisfies 0 ≤ target[i] ≤ C−1 and C is the number of classes. I changed the target to that form, but now I get an error saying (the sequence length is 75 and there are 55 classes):
Expected target size (75, 55), got torch.Size([75])
I've tried looking for solutions to both of these errors but still can't get it to work. I'm confused about the correct dimensions for the target, and about what the first error actually means (different searches gave very different meanings for the error, and none of the fixes worked).
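For reference, here is a minimal sketch of the setup described above (a random tensor stands in for the LSTM + Linear output, the targets are hypothetical one-hot labels, and the exact error text may vary between PyTorch versions):
import torch
import torch.nn as nn

sequence_length, number_of_classes = 75, 55
criterion = nn.CrossEntropyLoss()
# stand-in for the model output: [sequence_length, 1, number_of_classes]
output = torch.rand(sequence_length, 1, number_of_classes)
# targets as originally passed: [sequence_length, number_of_classes]
target = torch.eye(number_of_classes, dtype=torch.long)[torch.randint(number_of_classes, (sequence_length,))]
loss = criterion(output, target)  # fails with the assertion error quoted above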
Thanks
Answer 0: (score: 3)
You can use squeeze() on your output tensor; this returns a tensor with all dimensions of size 1 removed.
This short code uses the shapes you mentioned in your question:
import torch
import torch.nn as nn

sequence_length = 75
number_of_classes = 55
# creates random tensor of your output shape
output = torch.rand(sequence_length, 1, number_of_classes)
# creates tensor with random targets
target = torch.randint(55, (75,)).long()
# define loss function and calculate loss
criterion = nn.CrossEntropyLoss()
loss = criterion(output, target)
print(loss)
and results in the error you describe:
ValueError: Expected target size (75, 55), got torch.Size([75])
So using squeeze() on your output tensor solves your problem by giving it the correct shape.
Example with the correct shapes:
sequence_length = 75
number_of_classes = 55
# creates random tensor of your output shape
output = torch.rand(sequence_length, 1, number_of_classes)
# creates tensor with random targets
target = torch.randint(55, (75,)).long()
# define loss function and calculate loss
criterion = nn.CrossEntropyLoss()
# apply squeeze() on output tensor to change its shape from [75, 1, 55] to [75, 55]
loss = criterion(output.squeeze(), target)
print(loss)
Output:
tensor(4.0442)
Using squeeze() changes the tensor shape from [75, 1, 55] to [75, 55] so that the output shape and the target shape match!
You can also use other methods to reshape your tensor; what matters is that the shape is [sequence_length, number_of_classes] instead of [sequence_length, 1, number_of_classes].
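For example, here is a small sketch (assuming the same [75, 1, 55] output as above, with hypothetical variable names) of a few equivalent ways to drop the size-1 dimension:
import torch

output = torch.rand(75, 1, 55)          # [sequence_length, 1, number_of_classes]
a = output.squeeze(1)                   # remove only dimension 1
b = output.view(75, 55)                 # explicit reshape via view
c = output.reshape(output.size(0), -1)  # reshape, inferring the class dimension
print(a.shape, b.shape, c.shape)        # all torch.Size([75, 55])
Note that squeeze(1) is slightly safer than a bare squeeze() here, since squeeze() would also drop the sequence dimension if sequence_length ever happened to be 1.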
Your target should be a LongTensor, i.e. a tensor of type torch.long containing the class indices, with shape [sequence_length].
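If your targets are currently one-hot encoded with shape [sequence_length, number_of_classes] (an assumption based on the shape mentioned in the question), a minimal sketch of converting them to class indices could look like this:
import torch

sequence_length, number_of_classes = 75, 55
# hypothetical one-hot targets of shape [sequence_length, number_of_classes]
one_hot = torch.zeros(sequence_length, number_of_classes)
one_hot[torch.arange(sequence_length), torch.randint(number_of_classes, (sequence_length,))] = 1.0
# class-index targets of shape [sequence_length], dtype torch.int64 (long)
target = one_hot.argmax(dim=1)
print(target.shape, target.dtype)  # torch.Size([75]) torch.int64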
Edit:
Shapes of the example above when passed to the cross-entropy function:
Output: torch.Size([75, 55])
Target: torch.Size([75])
Here is a more general example of what the output and target for CE should look like. In this case we assume 5 different target classes and three examples for sequences of length 1, 2, and 3.
import torch
import torch.nn as nn

# init CE Loss function
criterion = nn.CrossEntropyLoss()
# sequence of length 1
output = torch.rand(1, 5)
# in this case the first class is our target; the index of the first class is 0
target = torch.LongTensor([0])
loss = criterion(output, target)
print('Sequence of length 1:')
print('Output:', output, 'shape:', output.shape)
print('Target:', target, 'shape:', target.shape)
print('Loss:', loss)
# sequence of length 2
output = torch.rand(2, 5)
# targets here are the first class for the first element and the second class for the second element
target = torch.LongTensor([0, 1])
loss = criterion(output, target)
print('\nSequence of length 2:')
print('Output:', output, 'shape:', output.shape)
print('Target:', target, 'shape:', target.shape)
print('Loss:', loss)
# sequence of length 3
output = torch.rand(3, 5)
# targets here are the first class, the second class, and the second class again for the last element of the sequence
target = torch.LongTensor([0, 1, 1])
loss = criterion(output, target)
print('\nSequence of length 3:')
print('Output:', output, 'shape:', output.shape)
print('Target:', target, 'shape:', target.shape)
print('Loss:', loss)
Output:
Sequence of length 1:
Output: tensor([[ 0.1956, 0.0395, 0.6564, 0.4000, 0.2875]]) shape: torch.Size([1, 5])
Target: tensor([ 0]) shape: torch.Size([1])
Loss: tensor(1.7516)
Sequence of length 2:
Output: tensor([[ 0.9905, 0.2267, 0.7583, 0.4865, 0.3220],
[ 0.8073, 0.1803, 0.5290, 0.3179, 0.2746]]) shape: torch.Size([2, 5])
Target: tensor([ 0, 1]) shape: torch.Size([2])
Loss: tensor(1.5469)
Sequence of length 3:
Output: tensor([[ 0.8497, 0.2728, 0.3329, 0.2278, 0.1459],
[ 0.4899, 0.2487, 0.4730, 0.9970, 0.1350],
[ 0.0869, 0.9306, 0.1526, 0.2206, 0.6328]]) shape: torch.Size([3, 5])
Target: tensor([ 0, 1, 1]) shape: torch.Size([3])
Loss: tensor(1.3918)
I hope this helps!