Why does my CNN network fail to learn high-level features?

Time: 2020-09-25 02:20:51

Tags: pytorch conv-neural-network prediction loss-function

I am trying to predict over a thousand images with a CNN network, as shown in the figures below. The CNN architecture and some details follow. I would like to know why the network does not learn correctly. As you can see, the predicted image values (contours) are far from the ground truth (the predicted values are more than 10x too large). It seems the network does well at edge detection (low-level features) but not at high-level features.

import torch
import torch.nn as nn

kernel = 3
num_filters = 12
batch_size = 128
lr = 1e-5

class Model(nn.Module):
    def __init__(self, kernel, num_filters, res=None):  # res (a ResidualBlock class) is never used below
        super(Model, self).__init__()
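        # Encoder: conv0 keeps the 600x600 input resolution;
        # conv1-conv4 each roughly halve the spatial size and double the channel count.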
        
        self.conv0 = nn.Sequential(
            nn.Conv2d(4, num_filters, kernel_size = kernel*3, 
                       padding = 4),
            nn.BatchNorm2d(num_filters),
            nn.ReLU(inplace=True))
            
        
        self.conv1 = nn.Sequential(
            nn.Conv2d(num_filters, num_filters*2, kernel_size = kernel, 
                      stride=2, padding = 1),
            nn.BatchNorm2d(num_filters*2),
            nn.ReLU(inplace=True))
    
        
        self.conv2 = nn.Sequential(
            nn.Conv2d(num_filters*2, num_filters*4, kernel_size = kernel, stride=2, padding = 1),
            nn.BatchNorm2d(num_filters*4),
            nn.ReLU(inplace=True))
        
        
        self.conv3 = nn.Sequential(
            nn.Conv2d(num_filters*4, num_filters*8, kernel_size = kernel, stride=2, padding = 2),
            nn.BatchNorm2d(num_filters*8),
            nn.ReLU(inplace=True))
        
        
        self.conv4 = nn.Sequential(
            nn.Conv2d(num_filters*8, num_filters*16, kernel_size = kernel, stride=2, padding = 1),
            nn.BatchNorm2d(num_filters*16),
            nn.ReLU(inplace=True))
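        # Decoder: each tsconv block upsamples by 2 and halves the channel count;
        # tsconv4 projects back to a single-channel 600x600 output.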
            
             
        self.tsconv0 = nn.Sequential(
            nn.ConvTranspose2d(num_filters*16, num_filters*8, kernel_size = kernel, padding =1),
            nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True),
            nn.ReLU(inplace=True),
            nn.BatchNorm2d(num_filters*8))

        self.tsconv1 = nn.Sequential(
            nn.ConvTranspose2d(num_filters*8, num_filters*4, kernel_size = kernel, padding = 1),
            nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True),
            nn.ReLU(inplace=True),
            nn.BatchNorm2d(num_filters*4))
        
        self.tsconv2 = nn.Sequential(
            nn.ConvTranspose2d(num_filters*4, num_filters*2, kernel_size = kernel, padding = 1),
            nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True),
            nn.ReLU(inplace=True),
            nn.BatchNorm2d(num_filters*2))
        
        self.tsconv3 = nn.Sequential(
            nn.ConvTranspose2d(num_filters*2, num_filters, kernel_size = kernel, padding = 1),
            nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True),
            nn.ReLU(inplace=True),
            nn.BatchNorm2d(num_filters))
        
        self.tsconv4 = nn.Sequential(
            nn.Conv2d(num_filters, 1, kernel_size = kernel*3, padding = 0, bias=False),
            nn.ReLU(inplace=True))
        
    def forward(self, x):
        x0 = self.conv0(x)           #([6, 600, 600])
        # print(x0.shape)
        x1 = self.conv1(x0)          #([12, 300, 300])
        # print(x1.shape)
        x2 = self.conv2(x1)          #([24, 150, 150])
        # print(x2.shape)
        x3 = self.conv3(x2)          #([48, 76, 76])
        # print(x3.shape)
        x4 = self.conv4(x3)          #([96, 38, 38])
        # print(x4.shape)       
        x5 = self.tsconv0(x4)        #([48, 76, 76])
        # print(x5.shape)
        x6 = self.tsconv1(x5)        #([24, 152, 152])
        # print(x6.shape)
        x7 = self.tsconv2(x6)        #([12, 304, 304])
        # print(x7.shape)
        x8 = self.tsconv3(x7)        #([6, 608, 608])
        # print(x8.shape)
        x9 = self.tsconv4(x8)        #([1, 600, 600])
        # print(x9.shape)
        return x9
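
For reference, a quick shape check of the model above (reusing the imports and hyperparameters defined earlier; the input is assumed to be a 4-channel 600x600 tensor, as implied by conv0 and the shape comments):

model = Model(kernel, num_filters)
dummy = torch.randn(2, 4, 600, 600)   # batch of 2 images, 4 channels, 600x600
with torch.no_grad():
    out = model(dummy)
print(out.shape)                      # expected: torch.Size([2, 1, 600, 600])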

[Image: ground truth]

[Image: predicted output]

[Image: training loss]

1 Answer:

Answer 0 (score: 1):

This is an autoencoder, right? You just want to reconstruct the input image? The lower image is the output and the upper one is the ground truth? Then I would suggest two things. First, it seems your architecture does not really generate that many feature maps: you go from 6 feature maps up to only 96. Usually in a CNN you go from something like 6 up to 512, for example:

layer1: 6 - layer2: 64 - layer3: 128 - layer4: 256 - layer5: 512 ...

That could be why it cannot learn high-level features: the model simply does not have enough capacity. You could also try making the bottleneck layer's feature maps spatially smaller than 38, somewhere around 8-16.
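
A minimal sketch of what such a wider encoder could look like, using the channel progression suggested above (the block structure and the extra strided stages are illustrative assumptions, only meant to show how the bottleneck can also be pushed down towards 8-16 pixels for a 600x600 input):

import torch.nn as nn

def conv_block(in_ch, out_ch, stride=2):
    # Conv -> BatchNorm -> ReLU; halves the spatial size when stride=2
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=stride, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True))

encoder = nn.Sequential(
    conv_block(4, 6, stride=1),   # 600 -> 600
    conv_block(6, 64),            # 600 -> 300
    conv_block(64, 128),          # 300 -> 150
    conv_block(128, 256),         # 150 -> 75
    conv_block(256, 512),         # 75  -> 38
    conv_block(512, 512),         # 38  -> 19
    conv_block(512, 512))         # 19  -> 10  (bottleneck in the 8-16 range)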

Second: if you do not use the bottleneck layer, you could add skip connections instead. That is, take one or all of the layers from the encoder and add them to the decoder layers that have the same dimensions. Hope this helps!
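
As a rough illustration of that second suggestion, here is a minimal, self-contained encoder-decoder sketch with one skip connection (the class name, channel counts and layer choices are assumptions for illustration, not the poster's model; the encoder feature map is concatenated with the decoder feature map of matching spatial size):

import torch
import torch.nn as nn

class SkipAutoencoder(nn.Module):
    # Assumes input height/width divisible by 4 so encoder and decoder sizes line up.
    def __init__(self, in_ch=4, base=64):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(in_ch, base, 3, stride=2, padding=1),
                                  nn.BatchNorm2d(base), nn.ReLU(inplace=True))       # H -> H/2
        self.enc2 = nn.Sequential(nn.Conv2d(base, base * 2, 3, stride=2, padding=1),
                                  nn.BatchNorm2d(base * 2), nn.ReLU(inplace=True))   # H/2 -> H/4
        self.up1 = nn.Sequential(nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True),
                                 nn.Conv2d(base * 2, base, 3, padding=1),
                                 nn.BatchNorm2d(base), nn.ReLU(inplace=True))         # H/4 -> H/2
        # the skip connection doubles the channel count entering the last block
        self.up2 = nn.Sequential(nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True),
                                 nn.Conv2d(base * 2, 1, 3, padding=1))                # H/2 -> H

    def forward(self, x):
        e1 = self.enc1(x)                  # H/2
        e2 = self.enc2(e1)                 # H/4 (bottleneck)
        d1 = self.up1(e2)                  # H/2, same spatial size as e1
        d1 = torch.cat([d1, e1], dim=1)    # skip connection: concat encoder features
        return self.up2(d1)                # back to H, single output channel

Concatenation is used here because it does not require matching channel counts; simple addition also works when the encoder and decoder feature maps have the same number of channels.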