Expected 4D tensor as input, got 2D tensor instead

Time: 2018-04-24 05:03:31

Tags: python machine-learning neural-network conv-neural-network pytorch

I am trying to build a neural network with the pretrained VGG16 network in PyTorch.

I know I need to adjust the classifier part of the network, so I have frozen the parameters to prevent backpropagation through them.

Code:


    %matplotlib inline
    %config InlineBackend.figure_format = 'retina'

    import matplotlib.pyplot as plt
    import numpy as np
    import time

    import torch
    from torch import nn
    from torch import optim
    import torch.nn.functional as F
    from torch.autograd import Variable
    from torchvision import datasets, transforms
    import torchvision.models as models
    from collections import OrderedDict

    data_dir = 'flowers'
    train_dir = data_dir + '/train'
    valid_dir = data_dir + '/valid'
    test_dir = data_dir + '/test'


    train_transforms = transforms.Compose([transforms.Resize(224),
                                           transforms.RandomRotation(30),
                                           transforms.RandomResizedCrop(224),
                                           transforms.RandomHorizontalFlip(),
                                           transforms.ToTensor(),
                                           transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                                                std=[0.229, 0.224, 0.225])])


    validn_transforms = transforms.Compose([transforms.Resize(224),
                                            transforms.CenterCrop(224),
                                            transforms.ToTensor(),
                                            transforms.Normalize((0.485, 0.456, 0.406),
                                                                 (0.229, 0.224, 0.225))])

    test_transforms = transforms.Compose([transforms.Resize(224),
                                          transforms.RandomResizedCrop(224),
                                          transforms.ToTensor(),
                                          transforms.Normalize((0.485, 0.456, 0.406),
                                                               (0.229, 0.224, 0.225))])


    train_data = datasets.ImageFolder(train_dir,
                                      transform=train_transforms)

    validn_data = datasets.ImageFolder(valid_dir,
                                       transform=validn_transforms)

    test_data = datasets.ImageFolder(test_dir,
                                     transform=test_transforms)


    trainloader = torch.utils.data.DataLoader(train_data, batch_size=32, shuffle=True)
    validnloader = torch.utils.data.DataLoader(validn_data, batch_size=32, shuffle=True)
    testloader = torch.utils.data.DataLoader(test_data, batch_size=32, shuffle=True)


    model = models.vgg16(pretrained=True)
    model


    for param in model.parameters():
        param.requires_grad = False

    classifier = nn.Sequential(OrderedDict([
                              ('fc1', nn.Linear(3*224*224, 10000)),
                              ('relu', nn.ReLU()),
                              ('fc2', nn.Linear(10000, 5000)),
                              ('relu', nn.ReLU()),
                              ('fc3', nn.Linear(5000, 102)),
                              ('output', nn.LogSoftmax(dim=1))
                              ]))

    model.classifier = classifier

    classifier


    criterion = nn.NLLLoss()
    optimizer = optim.Adam(model.classifier.parameters(), lr=0.001)
    model.cuda()

    epochs = 1
    steps = 0
    training_loss = 0
    print_every = 300
    for e in range(epochs):
        model.train()
        for images, labels in iter(trainloader):
            steps += 1

            images.resize_(32, 3*224*224)

            inputs = Variable(images.cuda())
            targets = Variable(labels.cuda())
            optimizer.zero_grad()

            output = model.forward(inputs)
            loss = criterion(output, targets)
            loss.backward()
            optimizer.step()

            training_loss += loss.data[0]

            if steps % print_every == 0:
                print("Epoch: {}/{}... ".format(e+1, epochs),
                      "Loss: {:.4f}".format(training_loss/print_every))

                training_loss = 0

Could it be because of the Linear operations I am using in the layer definitions?

1 answer:

Answer 0 (score: 2):

There are two problems with your network -

  1. You created your own classifier whose first layer accepts an input of size (3 * 224 * 224), but that is not the output size of vgg16's features part. The features produce an output tensor of size (25088).

  2. You are resizing your input to a tensor of shape (3*224*224) (for each batch), but the features part of vgg16 expects an input of shape (3, 224, 224). Your custom classifier comes after those features, so you need to prepare the input for the features, not for the classifier.

  Solution

    To solve the first problem, change your classifier definition to -

    classifier = nn.Sequential(OrderedDict([
                              ('fc1', nn.Linear(25088, 10000)),
                              ('relu1', nn.ReLU()),   # keys in the OrderedDict must be unique;
                              ('fc2', nn.Linear(10000, 5000)),
                              ('relu2', nn.ReLU()),   # a repeated 'relu' key would silently drop the first ReLU
                              ('fc3', nn.Linear(5000, 102)),
                              ('output', nn.LogSoftmax(dim=1))
                              ]))
    

    To solve the second problem, change images.resize_(32,3*224*224) to images.resize_(32, 3, 224, 224).
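    The effect of that change can be checked on a dummy batch (my own sketch, not from the original answer). Since the DataLoader already yields batches of shape (32, 3, 224, 224), the four-argument resize_ leaves the tensor untouched, whereas the two-argument call collapses it to 2D:

```python
import torch

batch = torch.randn(32, 3, 224, 224)  # what the DataLoader yields

flat = batch.clone()
flat.resize_(32, 3 * 224 * 224)       # original call: collapses the batch to 2D
print(flat.dim())                     # 2 -- rejected by the conv layers

batch.resize_(32, 3, 224, 224)        # fixed call: shape unchanged, still 4D
print(batch.dim())                    # 4 -- accepted by model.features
```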

    P.S. - A suggestion - the 10000 units output by your classifier's first layer are very large. You should try to keep it at around 4000, as in the original classifier (even better, reuse the original weights of the first layer only, since those have proven over time to be good features).