Question

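I have about 5 models, each of them trained well, but I would like to fuse them together into one big model. I'm looking into this because one large model is easier to update in production than many small ones. This is an image of what I want to achieve.

My question is: is it OK to do it like this? I only have one dataset per head model, so how am I supposed to train the whole model?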

Answer 1

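My question is: is it OK to do it like this?

Of course you can. This approach is called multi-task learning. Depending on your datasets and what you are trying to do, it may even improve performance. Microsoft used a multi-task model to achieve good results on the NLP GLUE benchmark, but they also noted that you can increase performance further by fine-tuning the joint model for each task.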
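The last point, fine-tuning the joint model for each task, could look roughly like the sketch below. It assumes the joint model is stored as an `nn.ModuleDict` with a `shared` entry and one entry per task, and it uses random tensors and an MSE loss as placeholders. `finetune_for_task` is a hypothetical helper, not part of any library.

```python
import copy

import torch
from torch import nn


def finetune_for_task(joint, tasktag, x, y, steps=10, lr=1e-2):
    """Return a copy of the joint model further trained on a single task."""
    model = copy.deepcopy(joint)  # the joint model itself stays unchanged
    # Only the shared part and the selected head are handed to the optimizer.
    params = list(model["shared"].parameters()) + list(model[tasktag].parameters())
    optimizer = torch.optim.SGD(params, lr=lr)
    loss_fn = nn.MSELoss()  # placeholder loss (assumption)
    for _ in range(steps):
        optimizer.zero_grad()
        loss = loss_fn(model[tasktag](model["shared"](x)), y)
        loss.backward()
        optimizer.step()
    return model


torch.manual_seed(0)
# Stand-in for a jointly trained model: a shared part plus two heads.
joint = nn.ModuleDict({
    "shared": nn.Linear(8, 16),
    "depth": nn.Linear(16, 1),
    "denseflow": nn.Linear(16, 2),
})
x, y = torch.randn(32, 8), torch.randn(32, 1)  # random placeholder data
depth_model = finetune_for_task(joint, "depth", x, y)
```

You end up with one fine-tuned copy per task, which trades the single-model deployment for per-task accuracy, so whether it is worth it depends on your production constraints.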

I only have one dataset per head model, so how am I supposed to train the whole model?

All you need is a PyTorch module container such as ModuleList or ModuleDict:

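# Please note this is only a sketch and I'm not well versed in computer vision.
# Check that the resnet50 import is correct for your torchvision version and
# replace the placeholder heads below with your task-specific modules.
from torch import nn
from torchvision.models import resnet50


class Depth(nn.Module):
    """Placeholder head; put your real depth head here."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(1000, 1)  # resnet50() outputs 1000 features

    def forward(self, x):
        return self.fc(x)


class Denseflow(nn.Module):
    """Placeholder head; put your real dense-flow head here."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(1000, 2)

    def forward(self, x):
        return self.fc(x)


class MultiTaskModel(nn.Module):
    def __init__(self):
        super().__init__()  # required before registering submodules
        # shared part
        self.resnet50 = resnet50()

        # task-specific heads, looked up by name
        self.tasks = nn.ModuleDict({
            'depth': Depth(),
            'denseflow': Denseflow(),
            # ...
        })

    def forward(self, tasktag, x):
        # shared part
        resnet_output = self.resnet50(x)

        # task-specific part
        if tasktag not in self.tasks:
            raise ValueError(f'unknown task: {tasktag}')
        return self.tasks[tasktag](resnet_output)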
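
As for training when each head has its own dataset, one common approach is to alternate batches between tasks, so that every step updates the shared part plus one head. The following is a minimal sketch under assumptions, not a definitive recipe: the tiny backbone, the linear heads, the random tensors standing in for datasets, and the names `TinyMultiTask`, `make_loader` and `train_alternating` are all hypothetical.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


class TinyMultiTask(nn.Module):
    """Hypothetical stand-in: small shared backbone plus one head per task."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(8, 16), nn.ReLU())
        self.tasks = nn.ModuleDict({
            "depth": nn.Linear(16, 1),
            "denseflow": nn.Linear(16, 2),
        })

    def forward(self, tasktag, x):
        return self.tasks[tasktag](self.backbone(x))


def make_loader(out_dim, n=32):
    # Random tensors stand in for a real per-task dataset (assumption).
    data = TensorDataset(torch.randn(n, 8), torch.randn(n, out_dim))
    return DataLoader(data, batch_size=8, shuffle=True)


def train_alternating(model, loaders, losses, epochs=1, lr=1e-2):
    """Round-robin over tasks: each step uses one batch from one task."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    history = []
    for _ in range(epochs):
        # zip stops at the shortest loader, which is fine for a sketch
        for batches in zip(*loaders.values()):
            for tasktag, (x, y) in zip(loaders.keys(), batches):
                optimizer.zero_grad()
                loss = losses[tasktag](model(tasktag, x), y)
                loss.backward()  # gradients reach the backbone and this head only
                optimizer.step()
                history.append((tasktag, loss.item()))
    return history


torch.manual_seed(0)
model = TinyMultiTask()
loaders = {"depth": make_loader(1), "denseflow": make_loader(2)}
losses = {"depth": nn.MSELoss(), "denseflow": nn.MSELoss()}
history = train_alternating(model, loaders, losses, epochs=2)
```

If your datasets differ a lot in size, you may prefer to pick the task for each step at random, weighted by dataset size, instead of strict round-robin.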

Answer 2

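For ideas, you can have a look at the Detectron2 project, especially the models that have been joined into it.

Chances are they use some of the same ideas.

Fusing models together means defining the inputs and outputs of the main model (which contains the sub-models).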
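
One way to read this: the main model owns the sub-models, takes a single input, and returns one output per sub-model. Below is a minimal sketch, assuming all heads consume the same shared features. `FusedModel` and the layer sizes are hypothetical and are not taken from Detectron2.

```python
import torch
from torch import nn


class FusedModel(nn.Module):
    """Hypothetical main model: one input, a dict with one output per sub-model."""

    def __init__(self, heads):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(8, 16), nn.ReLU())
        self.heads = nn.ModuleDict(heads)  # registers every sub-model

    def forward(self, x):
        features = self.shared(x)  # computed once, reused by every head
        return {name: head(features) for name, head in self.heads.items()}


fused = FusedModel({"depth": nn.Linear(16, 1), "denseflow": nn.Linear(16, 2)})
outputs = fused(torch.randn(4, 8))
```

Returning a dict keyed by task name keeps the output contract explicit, so a caller in production can pick the result it needs without knowing the order of the heads.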

How to do multi-head learning