如何解决pytorch中的尺寸不匹配错误?

时间:2020-06-02 15:56:56

标签: python machine-learning deep-learning pytorch

我正在尝试通过在PyTorch中使用CIFAR10数据来创建物流模型。运行模型进行评估后,我遇到了错误:

RuntimeError:大小不匹配,m:[750 x 4096],m2:[1024 x 10],位于C:\ w \ 1 \ s \ tmp_conda_3.7_100118 \ conda \ conda-bld \ pytorch_1579082551706 \ work \ aten \ src \ TH / generic / THTensorMath.cpp:136

似乎input_size产生了问题,我不知道我是新来的。请让我知道我应该进行哪些更改以克服此错误。

这些是超参数:

batch_size = 100
learning_rate = 0.001

# Other constants
input_size = 4*4*64
num_classes = 10

这是将数据集下载并拆分为训练,验证和测试的单元。

transform = torchvision.transforms.Compose(
    [torchvision.transforms.ToTensor(),
     torchvision.transforms.Normalize((0.5,0.5,0.5), (0.5,0.5,0.5))])

testset = torchvision.datasets.CIFAR10(root='D:\PyTorch\cifar-10-python', train=False,download=False, transform=transform)
trainvalset = torchvision.datasets.CIFAR10(root='D:\PyTorch\cifar-10-python', train=True,download=False, transform=transform)
trainset, valset = torch.utils.data.random_split(trainvalset, [45000, 5000]) # 10% for validation

train_loader = torch.utils.data.DataLoader(trainset, batch_size=50, shuffle=True)
test_loader = torch.utils.data.DataLoader(testset, batch_size=1000, shuffle=False)
val_loader = torch.utils.data.DataLoader(valset, batch_size=1000, shuffle=False)

这是我模型的架构。

class CifarModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(input_size,  num_classes)
    def forward(self, xb):
        xb = xb.view(-1, 64*8*8)
        #xb = xb.reshape(-1, 784)
        print(xb.shape)
        out = self.linear(xb)
        return out

    def training_step(self, batch):
        images, labels = batch 
        out = self(images)                  # Generate predictions
        loss = F.cross_entropy(out, labels) # Calculate loss
        return loss

    def validation_step(self, batch):
        images, labels = batch 
        out = self(images)                    # Generate predictions
        loss = F.cross_entropy(out, labels)   # Calculate loss
        acc = accuracy(out, labels)           # Calculate accuracy
        return {'val_loss': loss.detach(), 'val_acc': acc.detach()}

    def validation_epoch_end(self, outputs):
        batch_losses = [x['val_loss'] for x in outputs]
        epoch_loss = torch.stack(batch_losses).mean()   # Combine losses
        batch_accs = [x['val_acc'] for x in outputs]
        epoch_acc = torch.stack(batch_accs).mean()      # Combine accuracies
        return {'val_loss': epoch_loss.item(), 'val_acc': epoch_acc.item()}

    def epoch_end(self, epoch, result):
        print("Epoch [{}], val_loss: {:.4f}, val_acc: {:.4f}".format(epoch, result['val_loss'], result['val_acc']))

model = CifarModel()
def accuracy(outputs, labels):
    _, preds = torch.max(outputs, dim=1)
    return torch.tensor(torch.sum(preds == labels).item() / len(preds))
def evaluate(model, val_loader):
    outputs = [model.validation_step(batch) for batch in val_loader]
    return model.validation_epoch_end(outputs)

def fit(epochs, lr, model, train_loader, val_loader, opt_func=torch.optim.SGD):
    history = []
    optimizer = opt_func(model.parameters(), lr)
    for epoch in range(epochs):
        # Training Phase 
        for batch in train_loader:
            loss = model.training_step(batch)
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
        # Validation phase
        result = evaluate(model, val_loader)
        model.epoch_end(epoch, result)
        history.append(result)
    return history
evaluate(model, val_loader)

1 个答案:

答案 0 :(得分:0)

您在此处指定输出类的数量应为10:

num_classes = 10

您的前进功能不反映这一点:

xb = xb.view(-1, 64*8*8) # you get 750x4096
out = self.linear(xb) # here an input of 
# input_size to linear layer = 4*4*64 # 1024
# num_classes = 10 

像这样修改它:

xb = xb.view(-1, 64*4*4) # you get 750x1024
out = self.linear(xb) # M1 750x1024 M2 1024x10:
# input_size = 4*4*64 # 1024
# num_classes = 10