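I am confused about how nn.Linear works. In VGG-19, the output of the last nn.MaxPool2d has size (512, 7, 7). The model below applies a pooling layer, reshapes the tensor to (512, 49), and then feeds it directly into nn.Linear(512, 7). Why does this work without a size mismatch?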
'''VGG11/13/16/19 in Pytorch.'''
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable

cfg = {
    'VGG11': [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    'VGG13': [64, 64, 'M', 128, 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    'VGG16': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M', 512, 512, 512, 'M', 512, 512, 512, 'M'],
    'VGG19': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M', 512, 512, 512, 512, 'M', 512, 512, 512, 512, 'M'],
}

class VGG(nn.Module):
    def __init__(self, vgg_name):
        super(VGG, self).__init__()
        self.features = self._make_layers(cfg[vgg_name])
        self.classifier = nn.Linear(512, 7)

    def forward(self, x):
        out = self.features(x)
        out = out.view(out.size(0), -1)
        out = F.dropout(out, p=0.5, training=self.training)
        out = self.classifier(out)
        return out

    def _make_layers(self, cfg):
        layers = []
        in_channels = 3
        for x in cfg:
            if x == 'M':
                layers += [nn.MaxPool2d(kernel_size=2, stride=2)]
            else:
                layers += [nn.Conv2d(in_channels, x, kernel_size=3, padding=1),
                           nn.BatchNorm2d(x),
                           nn.ReLU(inplace=True)]
                in_channels = x
        layers += [nn.AvgPool2d(kernel_size=1, stride=1)]
        return nn.Sequential(*layers)
Answer 0 (score: 0)
Why do you assume this code works? I tested it and got the shapes below, along with the expected size mismatch error.
def forward(self, x):
    out = self.features(x)  # torch.Size([1, 512, 7, 7])
    out = out.view(out.size(0), -1)  # torch.Size([1, 25088])
    out = F.dropout(out, p=0.5, training=self.training)  # torch.Size([1, 25088])
    out = self.classifier(out)  # RuntimeError: size mismatch, m1: [1 x 25088], m2: [512 x 7]
    return out
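A common mistake when inferring sizes is to leave out the batch dimension. That is probably why you concluded that out.view(out.size(0), -1) changes the shape from [512, 7, 7] to [512, 49], when the actual change is [b, 512, 7, 7] -> [b, 25088], where b is the batch size.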
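The per-sample size 25088 can be checked by hand from the layer list in the question. The sketch below is plain Python and makes some assumptions: the helper name `flattened_size` is hypothetical, and the 224x224 input is not stated in the thread but is consistent with the 7x7 feature map reported above. The trailing nn.AvgPool2d(kernel_size=1, stride=1) is ignored because it leaves the size unchanged.

```python
# The VGG19 layer list from the question: integers are conv output channels,
# 'M' marks a max-pool with kernel_size=2 and stride=2.
VGG19 = [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M',
         512, 512, 512, 512, 'M', 512, 512, 512, 512, 'M']


def flattened_size(layer_cfg, height, width):
    """Return the per-sample feature count once the conv stack output is flattened."""
    channels = 3
    for item in layer_cfg:
        if item == 'M':
            # A 2x2 max-pool with stride 2 halves each spatial dimension (rounding down).
            height //= 2
            width //= 2
        else:
            # A 3x3 conv with padding=1 keeps height and width; only channels change.
            channels = item
    return channels * height * width


print(flattened_size(VGG19, 224, 224))  # 25088 = 512 * 7 * 7
print(flattened_size(VGG19, 32, 32))    # 512 = 512 * 1 * 1
```

Under these assumptions, a 32x32 input would flatten to exactly 512 features per sample, which may be why the original code pairs this feature extractor with nn.Linear(512, 7). The batch dimension b sits in front of this number in both cases.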
Change the classifier to take the expected number of input features:
self.classifier = nn.Linear(25088, 7)
The forward function then runs without a size mismatch error.
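A minimal sketch of the corrected shapes, using a zero tensor as a stand-in for the output of self.features so that the full network does not need to be built. The batch size of 4 and the variable names are arbitrary choices for illustration, not part of the original code.

```python
import torch
import torch.nn as nn

# Stand-in for the output of self.features: a batch of 4 feature maps.
features = torch.zeros(4, 512, 7, 7)

# view(size(0), -1) keeps dimension 0 (the batch) and flattens everything else.
flat = features.view(features.size(0), -1)
print(flat.shape)  # torch.Size([4, 25088])

# A linear layer with in_features = 512 * 7 * 7 accepts the flattened batch.
classifier = nn.Linear(512 * 7 * 7, 7)
logits = classifier(flat)
print(logits.shape)  # torch.Size([4, 7])
```

The linear layer only constrains the last dimension, so the same classifier works for any batch size.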