Question

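I am trying to implement a deep learning paper (https://github.com/kiankd/corel2019), and I get a strange error when feeding it real data (MNIST), but no error when using the same synthetic data as the authors. The error occurs in this function: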

def get_armask(shape, labels, device=None):
    mask = torch.zeros(shape).to(device)
    arr = torch.arange(0, shape[0]).long().to(device)
    mask[arr, labels] = -1.
    return mask

More specifically, on this line:

mask[arr, labels] = -1.

The error is:

RuntimeError: The shape of the mask [500] at index 0 does not match the shape of the indexed tensor [500, 10] at index 1

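Strangely, with the synthetic data there is no error and everything works perfectly. If I print the shapes, I get the following (for both the synthetic data and MNIST):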

mask torch.Size([500, 10])
arr torch.Size([500])
labels torch.Size([500])

The code used to generate the synthetic data is:

X_data = (torch.rand(N_samples, D_input) * 10.).to(device)
labels = torch.LongTensor([i % N_classes for i in range(N_samples)]).to(device)

The code to load MNIST is:

train_images = mnist.train_images()
X_data_all = train_images.reshape((train_images.shape[0], train_images.shape[1] * train_images.shape[2]))
X_data = torch.tensor(X_data_all[:500,:]).to(device)
X_data = X_data.type(torch.FloatTensor)

labels = torch.tensor(mnist.train_labels()[:500]).to(device)

get_armask is used like this:

def forward(self, predictions, labels):
    mask = get_armask(predictions.shape, labels, device=self.device)

    # make the attractor and repulsor, mask them!
    attraction_tensor = mask * predictions
    repulsion_tensor = (mask + 1) * predictions

    # now, apply the special cosine-COREL rules, taking the argmax and squaring the repulsion
    repulsion_tensor, _ = repulsion_tensor.max(dim=1)
    repulsion_tensor = repulsion_tensor ** 2

    return arloss(attraction_tensor, repulsion_tensor, self.lam)

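The actual problem seems to be something other than what the error message describes, but I don't know where to look. I have tried a few things, such as changing the learning rate and normalizing the MNIST data to roughly the same range as the test data, but nothing seems to help.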

Any suggestions? Thanks a lot in advance!

Answer 1

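After exchanging a few emails with the author of the paper, we figured out the problem. The labels were of type Byte rather than Long, and that caused the error. The error message is very misleading: the actual problem has nothing to do with the shapes...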
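For anyone hitting the same thing, here is a minimal sketch of the fix, not the authors' code. It assumes the labels arrive as a uint8 (Byte) tensor, which is what torch.tensor() produces from a uint8 array, and it casts them to Long before they are used as indices. As far as I understand, PyTorch treats a Byte index tensor as a mask instead of a list of integer indices, which would explain why the message complains about mask shapes. The 4 x 3 shape and the label values below are made up for illustration.

```python
import torch


def get_armask(shape, labels, device=None):
    # Same logic as the function in the question, plus an explicit cast so the
    # labels are always used as integer column indices (assumed fix).
    mask = torch.zeros(shape, device=device)
    arr = torch.arange(shape[0], device=device)
    mask[arr, labels.long()] = -1.
    return mask


# Hypothetical stand-in for the first few MNIST labels: a uint8 (Byte) tensor.
byte_labels = torch.tensor([2, 0, 1, 2], dtype=torch.uint8)
print(byte_labels.dtype)  # torch.uint8

# 4 samples, 3 classes: each row gets -1 in the column of its label.
mask = get_armask((4, 3), byte_labels)
print(mask)
```

Casting once at load time, e.g. labels = torch.tensor(mnist.train_labels()[:500]).long().to(device), should work just as well and leaves get_armask untouched. Printing labels.dtype next to the shapes would have exposed the difference between the synthetic labels (built with torch.LongTensor) and the MNIST labels.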

Tensor shape mismatch error in PyTorch on the MNIST dataset, but no error on synthetic data
