Question

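My question is similar to the one asked here: keras combining two losses with adjustable weights

However, the outputs have different dimensions, so they cannot be concatenated. That solution therefore does not apply. Is there another way to solve this problem?

The problem:

I have a Keras functional model with two layers whose outputs are x1 and x2.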

x1 = Dense(1,activation='relu')(prev_inp1)

x2 = Dense(2,activation='relu')(prev_inp2)

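I need to use x1 and x2 in a weighted loss function, as shown in the attached image, and propagate the "same loss" to both branches. Alpha should be flexible, so that it can vary with the iterations.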

Answer 1

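This question calls for a more elaborate solution. Since we are going to use a trainable weight, we need a custom layer.

We also need a different form of training, because our loss does not work like the usual ones that take only y_true and y_pred: it has to combine two different outputs.

So we will create two versions of the same model, one for prediction and one for training. The training version contains the loss itself and is compiled with a dummy Keras loss function.

Prediction model

Let's use a very basic example of a model with two outputs and one input: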

#any input your true model takes
inp = Input((5,5,2))

#represents the localization output
outImg = Conv2D(1,3,activation='sigmoid')(inp)

#represents the classification output
outClass = Flatten()(inp)
outClass = Dense(2,activation='sigmoid')(outClass)

#the model
predictionModel = Model(inp, [outImg,outClass])

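You use this model for regular prediction. There is no need to compile it.

Losses for each branch

Now let's create a custom loss function for each branch, one for LossCls and another for LossLoc.

Dummy examples are used here; you can elaborate these losses further if necessary. The most important thing is that they output batches shaped like (batch, 1) or (batch,). Both must output the same shape so that they can be summed later.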

def calcImgLoss(x):
    true,pred = x
    loss = binary_crossentropy(true,pred)
    return K.mean(loss, axis=[1,2])

def calcClassLoss(x):
    true,pred = x
    return binary_crossentropy(true,pred)

These will be used in Lambda layers in the training model.
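As a rough check of the shape requirement, the sketch below reproduces both reductions with NumPy only. The `binary_crossentropy` function here is a stand-in written for this illustration (it averages over the last axis, as the Keras loss does), and the array values are made up.

```python
import numpy as np

def binary_crossentropy(true, pred, eps=1e-7):
    # NumPy stand-in for the Keras loss: element-wise BCE averaged over the last axis
    pred = np.clip(pred, eps, 1 - eps)
    bce = -(true * np.log(pred) + (1 - true) * np.log(1 - pred))
    return bce.mean(axis=-1)

batch = 7
true_img, pred_img = np.ones((batch, 3, 3, 1)), np.full((batch, 3, 3, 1), 0.5)
true_cls, pred_cls = np.ones((batch, 2)), np.full((batch, 2), 0.5)

# Image branch: (7, 3, 3, 1) -> (7, 3, 3) -> mean over height and width -> (7,)
img_loss = binary_crossentropy(true_img, pred_img).mean(axis=(1, 2))
# Class branch: (7, 2) -> (7,)
cls_loss = binary_crossentropy(true_cls, pred_cls)

print(img_loss.shape, cls_loss.shape)
```

Both results have one value per sample, which is what allows them to be combined element-wise afterwards.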

Loss weighting layer

Now let's weight the losses with a trainable alpha. A trainable parameter requires a custom layer.

class LossWeighter(Layer):
    def __init__(self, **kwargs): #kwargs can have 'name' and other things
        super(LossWeighter, self).__init__(**kwargs)

    #create the trainable weight here, notice the constraint between 0 and 1
    def build(self, inputShape):
        self.weight = self.add_weight(name='loss_weight', 
                                     shape=(1,),
                                     initializer=Constant(0.5), 
                                     constraint=Between(0,1),
                                     trainable=True)
        super(LossWeighter,self).build(inputShape)

    def call(self,inputs):
        firstLoss, secondLoss = inputs
        return (self.weight * firstLoss) + ((1-self.weight)*secondLoss)

    def compute_output_shape(self,inputShape):
        return inputShape[0]
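Written out, the layer's `call` computes a convex combination of the two per-sample losses. The symbols below are notation chosen for this note (alpha for `self.weight`, L_first and L_second for the two inputs), not names from the code:

```latex
L = \alpha \, L_{\text{first}} + (1 - \alpha) \, L_{\text{second}}, \qquad 0 \le \alpha \le 1

\frac{\partial L}{\partial L_{\text{first}}} = \alpha, \qquad
\frac{\partial L}{\partial L_{\text{second}}} = 1 - \alpha
```

So a single scalar loss reaches both branches, with the gradient flowing into each branch scaled by its share of the weight.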

Note that there is a custom constraint to keep this weight between 0 and 1. The constraint is implemented as:

class Between(Constraint):
    def __init__(self,min_value,max_value):
        self.min_value = min_value
        self.max_value = max_value

    def __call__(self,w):
        return K.clip(w,self.min_value, self.max_value)

    def get_config(self):
        return {'min_value': self.min_value,
                'max_value': self.max_value}
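Keras applies a constraint to a variable after each optimizer update, so the clipping acts as a projection back into the allowed range. The plain-Python sketch below imitates that for a single scalar alpha; the loss values, learning rate, and function names are assumptions made for illustration only.

```python
def clip(value, min_value, max_value):
    # Scalar equivalent of K.clip
    return max(min_value, min(max_value, value))

def sgd_step_on_alpha(alpha, first_loss, second_loss, lr=0.1):
    # Gradient of alpha*first + (1 - alpha)*second with respect to alpha
    grad = first_loss - second_loss
    # Plain SGD update, then the constraint projects alpha back into [0, 1]
    return clip(alpha - lr * grad, 0.0, 1.0)

alpha = 0.5
for _ in range(20):
    # Hypothetical constant losses: the first one is larger than the second
    alpha = sgd_step_on_alpha(alpha, first_loss=2.0, second_loss=0.5)

print(alpha)
```

In this toy setting alpha moves toward the bound that favours the smaller loss and then stays there, which is worth keeping in mind when reading the learned value after training.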

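Training model

This model takes the prediction model as its base, adds the loss calculations and the loss weighter at the end, and outputs only the loss value. Because it outputs only a loss, we feed the true targets in as inputs and use a dummy loss function defined as: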

def ignoreLoss(true,pred):
    return pred #this just tries to minimize the prediction without any extra computation
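To see why the dummy targets do not matter, here is a small NumPy sketch of the same idea. The per-sample values are invented, and `ignore_loss` is just a renamed copy of the function above.

```python
import numpy as np

def ignore_loss(true, pred):
    # The model output already is the loss, so the target is never read
    return pred

weighted_loss = np.array([0.9, 0.4, 0.2])  # hypothetical per-sample model output
dummy_a = np.zeros(3)
dummy_b = np.full(3, 123.0)

# Keras averages the per-sample values over the batch; mimic that with mean()
value_a = ignore_loss(dummy_a, weighted_loss).mean()
value_b = ignore_loss(dummy_b, weighted_loss).mean()

print(value_a == value_b)
```

Any array with the right batch size can therefore serve as the target passed to `fit`.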

Model inputs:

#true targets
trueImg = Input((3,3,1))
trueClass = Input((2,))

#predictions from the prediction model
predImg = predictionModel.outputs[0]
predClass = predictionModel.outputs[1]
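The (3,3,1) target shape follows from the prediction model: a 3x3 convolution with Keras's default 'valid' padding and stride 1 shrinks a 5x5 input to 3x3. The helper below is a hypothetical one written for this note to make that arithmetic explicit.

```python
def conv_output_size(input_size, kernel_size, stride=1):
    # Output length along one spatial axis for a 'valid' (no padding) convolution
    return (input_size - kernel_size) // stride + 1

height = conv_output_size(5, 3)
width = conv_output_size(5, 3)

# Spatial size of outImg followed by its single filter
print((height, width, 1))
```

If your real model uses different padding or strides, the target input shape has to be adjusted to match.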

Model outputs = losses:

imageLoss = Lambda(calcImgLoss, name='loss_loc')([trueImg, predImg])
classLoss = Lambda(calcClassLoss, name='loss_cls')([trueClass, predClass])
weightedLoss = LossWeighter(name='weighted_loss')([imageLoss,classLoss])

Model:

trainingModel = Model([predictionModel.input, trueImg, trueClass], weightedLoss)
trainingModel.compile(optimizer='sgd', loss=ignoreLoss)

Dummy training

inputImages = np.zeros((7,5,5,2))
outputImages = np.ones((7,3,3,1))
outputClasses = np.ones((7,2))
dummyOut = np.zeros((7,))

trainingModel.fit([inputImages,outputImages,outputClasses], dummyOut, epochs = 50)
predictionModel.predict(inputImages)

Necessary imports

import numpy as np
import keras.backend as K
from keras.layers import *
from keras.models import Model
from keras.constraints import Constraint
from keras.initializers import Constant
from keras.losses import binary_crossentropy #or another you need

Answer 2

You don't need to concatenate your outputs. To pass multiple arguments to the loss function, you can wrap it as follows:

def custom_loss(x1, x2, y1, y2, alpha):
    def loss(y_true, y_pred):
        return (1-alpha) * loss_cls(y1, x1) + alpha * loss_loc(y2, x2)
    return loss
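The wrapper relies on an ordinary Python closure: the inner function keeps access to the arguments of the outer one. The NumPy sketch below shows that mechanism outside Keras. Since the answer does not define `loss_cls` and `loss_loc`, the two versions here are hypothetical stand-ins, and the arrays are made-up data.

```python
import numpy as np

def loss_cls(true, pred):
    # Hypothetical stand-in: mean squared error per sample
    return np.mean((true - pred) ** 2, axis=-1)

def loss_loc(true, pred):
    # Hypothetical stand-in: mean absolute error per sample
    return np.mean(np.abs(true - pred), axis=-1)

def custom_loss(x1, x2, y1, y2, alpha):
    def loss(y_true, y_pred):
        # y_true and y_pred are ignored; the captured values are used instead
        return (1 - alpha) * loss_cls(y1, x1) + alpha * loss_loc(y2, x2)
    return loss

x1 = np.array([[0.0], [1.0]])
y1 = np.array([[1.0], [1.0]])
x2 = np.array([[0.0, 0.0], [1.0, 1.0]])
y2 = np.array([[1.0, 1.0], [1.0, 1.0]])

combined = custom_loss(x1, x2, y1, y2, alpha=0.5)(None, None)
print(combined.shape)
```

Although x1 and x2 have different widths, each branch loss reduces to one value per sample, so the weighted sum is well defined without any concatenation.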

Then compile your functional model as:

x1 = Dense(1, activation='relu')(prev_inp1)
x2 = Dense(2, activation='relu')(prev_inp2)
y1 = Input((1,))
y2 = Input((2,))

model.compile('sgd',
              loss=custom_loss(x1, x2, y1, y2, 0.5),
              target_tensors=[y1, y2])

Note: not tested.

Keras: combining two losses with adjustable weights where the outputs do not have the same dimensionality
