Question

I have a network like this one (example from here):

import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # 1 input image channel, 6 output channels, 5x5 square convolution
        # kernel
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        # an affine operation: y = Wx + b
        self.fc1 = nn.Linear(16 * 5 * 5, 120)  # 5*5 from image dimension
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # Max pooling over a (2, 2) window
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        # If the size is a square, you can specify with a single number
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = torch.flatten(x, 1) # flatten all dimensions except the batch dimension
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
net = Net()

and another network (example from here):

class binaryClassification(nn.Module):
    def __init__(self):
        super(binaryClassification, self).__init__()
        # Number of input features is 12.
        self.layer_1 = nn.Linear(12, 64) 
        self.layer_2 = nn.Linear(64, 64)
        self.layer_out = nn.Linear(64, 1) 
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(p=0.1)
        self.batchnorm1 = nn.BatchNorm1d(64)
        self.batchnorm2 = nn.BatchNorm1d(64)
        
    def forward(self, inputs):
        x = self.relu(self.layer_1(inputs))
        x = self.batchnorm1(x)
        x = self.relu(self.layer_2(x))
        x = self.batchnorm2(x)
        x = self.dropout(x)
        x = self.layer_out(x)
        return x

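I want to change, for example, "self.fc2 = nn.Linear(120, 84)" so that it takes 121 inputs, where the 121st is the x (output) of the binaryClassification network. The idea is that I want to use the CNN network and the non-CNN network at the same time, training both so that each influences the other.

Is that possible? How can I do it? (Keras or PyTorch examples are both fine.)

Or is this idea crazy, and is there a simpler way to mix tabular data and images as the input to a single network?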

Answer 1

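The simplest way is to instantiate both models, add the two predictions together, and compute the loss on the sum. Backpropagation then runs through both models: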

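import torch
import torch.optim as optim

net1 = Net1()  # Net1 and Net2 are placeholders for your two models
net2 = Net2()
bce = torch.nn.BCEWithLogitsLoss()
# A single optimizer over the parameters of both models
params = list(net1.parameters()) + list(net2.parameters())
optimizer = optim.SGD(params, lr=0.01)  # lr is an example value; tune it
for x, ground_truth in your_data_loader:
    optimizer.zero_grad()
    prediction = net1(x) + net2(x)  # the 2 models must output tensors of same shape
    loss = bce(prediction, ground_truth)
    loss.backward()  # gradients flow into both net1 and net2
    optimizer.step()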

You could also, for example:

implement the Net1 and Net2 layers in a single model
train Net1 and Net2 separately, then ensemble them
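
Training the two models separately and ensembling them afterwards could look roughly like the sketch below. It is a minimal illustration, not the answerer's code: `TinyNet`, the 12-feature input, and the random batch are made up for the example, and averaging the sigmoid probabilities is only one common way to combine two binary classifiers.

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    # Hypothetical stand-in for Net1 / Net2: 12 input features -> 1 logit
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(nn.Linear(12, 16), nn.ReLU(), nn.Linear(16, 1))

    def forward(self, x):
        return self.layers(x)

torch.manual_seed(0)
net1, net2 = TinyNet(), TinyNet()  # assume each was already trained on its own
net1.eval()
net2.eval()

x = torch.randn(4, 12)  # dummy batch of 4 samples
with torch.no_grad():
    # Ensemble by averaging the predicted probabilities of the two models
    probs = (torch.sigmoid(net1(x)) + torch.sigmoid(net2(x))) / 2
    labels = (probs > 0.5).long()  # threshold at 0.5 to get class labels

print(probs.shape)  # torch.Size([4, 1])
```

Unlike the summed-prediction loop, no gradient is shared here, so the two models do not influence each other during training; they are only combined at prediction time.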

Answer 2

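That is a perfectly valid approach: you take two different input data sources, process them, and combine the results to solve a common goal (in this case it seems to be a 10-class image classification). You can define the input of your Net network as a tuple of the original image that Net needs and the 12-value features vector for binaryClassification. Sample code would be: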

import torch
import torch.nn as nn
import torch.nn.functional as F

class binaryClassification(nn.Module):
    ...  # same as above

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # 1 input image channel, 6 output channels, 5x5 square convolution
        # kernel
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        # an affine operation: y = Wx + b
        self.fc1 = nn.Linear(16 * 5 * 5, 120)  # 5*5 from image dimension
        self.binClas = binaryClassification()        
        self.fc2 = nn.Linear(121, 84)
        self.fc3 = nn.Linear(84, 10)
  
    def forward(self, inputs):
        x, features = inputs    # split tuple
        # Max pooling over a (2, 2) window
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        # If the size is a square, you can specify with a single number
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = torch.flatten(x, 1) # flatten all dimensions except the batch dimension
        # Concatenate with the binaryClassification output along the feature
        # dimension: 120 + 1 = 121 columns per sample
        x = torch.cat([F.relu(self.fc1(x)), self.binClas(features)], dim=1)
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
net = Net()
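
To check that the shapes line up, one can push a dummy batch through a two-branch model of this kind. The sketch below is a condensed restatement under assumptions: `TabularBranch` and `TwoBranchNet` are illustrative names, the tabular branch is simplified (no batch norm or dropout), the image is assumed to be 1x32x32 (which is what makes the flattened size 16 * 5 * 5), and the concatenation runs along `dim=1` so that the 120 image features and the 1 tabular output form 121 columns per sample.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TabularBranch(nn.Module):
    # Hypothetical, simplified stand-in for binaryClassification: 12 features -> 1 value
    def __init__(self):
        super().__init__()
        self.layer_1 = nn.Linear(12, 64)
        self.layer_out = nn.Linear(64, 1)

    def forward(self, features):
        return self.layer_out(F.relu(self.layer_1(features)))

class TwoBranchNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.tabular = TabularBranch()
        self.fc2 = nn.Linear(120 + 1, 84)  # 120 image features + 1 tabular output
        self.fc3 = nn.Linear(84, 10)

    def forward(self, inputs):
        image, features = inputs  # split the (image, features) tuple
        x = F.max_pool2d(F.relu(self.conv1(image)), 2)
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = torch.flatten(x, 1)  # (batch, 400)
        # Join the two branches column-wise: (batch, 120) + (batch, 1) -> (batch, 121)
        x = torch.cat([F.relu(self.fc1(x)), self.tabular(features)], dim=1)
        x = F.relu(self.fc2(x))
        return self.fc3(x)

model = TwoBranchNet()
images = torch.randn(8, 1, 32, 32)   # dummy image batch (assumed 32x32 input)
features = torch.randn(8, 12)        # dummy tabular batch
targets = torch.randint(0, 10, (8,)) # dummy class labels

logits = model((images, features))
loss = F.cross_entropy(logits, targets)
loss.backward()  # one backward pass populates gradients in both branches

print(logits.shape)  # torch.Size([8, 10])
```

After `loss.backward()`, both `model.conv1.weight.grad` and `model.tabular.layer_1.weight.grad` are populated, which is what lets a single loss train the two branches together.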

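But! Be careful when training them together: it is hard to balance the two branches of the network so that both of them learn. I suggest training them separately for a while before joining them (in general, hyperparameters that suit one part of the network may not be optimal for the other). To do that, you can freeze one part of the network while training the other, and vice versa. (Check this link to learn how to freeze parts of a torch nn.)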
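
Freezing one branch while the other trains can be done by turning off `requires_grad` on its parameters. The sketch below shows one way to do that, under assumptions: `branch_a`, `branch_b`, and `head` are hypothetical modules standing in for the CNN branch, the tabular branch, and the shared layers, `set_trainable` is a helper invented for the example, and the data is random.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

torch.manual_seed(0)
branch_a = nn.Linear(20, 8)   # hypothetical stand-in for the CNN branch
branch_b = nn.Linear(12, 1)   # hypothetical stand-in for the tabular branch
head = nn.Linear(8 + 1, 10)   # shared layers after concatenation

def set_trainable(module, flag):
    # Enable or disable gradient computation for every parameter of a module
    for p in module.parameters():
        p.requires_grad_(flag)

set_trainable(branch_b, False)  # freeze branch_b, keep training branch_a and head

# Give the optimizer only the parameters that are still trainable
trainable = [p for m in (branch_a, branch_b, head)
             for p in m.parameters() if p.requires_grad]
optimizer = optim.SGD(trainable, lr=0.01)  # lr is an example value

frozen_before = branch_b.weight.detach().clone()
xa, xb = torch.randn(4, 20), torch.randn(4, 12)
target = torch.randint(0, 10, (4,))

optimizer.zero_grad()
out = head(torch.cat([F.relu(branch_a(xa)), branch_b(xb)], dim=1))
loss = F.cross_entropy(out, target)
loss.backward()
optimizer.step()

print(torch.equal(branch_b.weight, frozen_before))  # True: frozen weights did not move
```

To swap roles, call `set_trainable(branch_b, True)` and `set_trainable(branch_a, False)`, then rebuild the optimizer. If the frozen branch contains batch norm or dropout layers, also call `.eval()` on it, since batch norm running statistics keep updating in training mode even when `requires_grad` is off.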

Is it possible to combine 2 neural networks?
