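The problem is to build a network for a multi-task problem in which not all targets are provided. For example, we have a binary input vector of length 256 and 8 outputs, but not all 8 target values are given for every sample. The set of known targets also varies between samples: one sample may have all 8 targets known, while another has only the second, third, fifth and seventh. How do I correctly mask the outputs of the network so that the error is computed on, and backpropagated through, only the known outputs?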
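I came up with something like the following, which computes the accuracy and the loss from the known outputs only (source: https://github.com/keras-team/keras/issues/3893). Toy problem: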
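# toy problem
# Imports assume standalone Keras 2.x; with TensorFlow, import from tensorflow.keras instead.
import numpy as np
from keras import backend as K
from keras.layers import Input, Dense, Masking, Multiply
from keras.models import Model

MASK_VALUE = -1
n = 25  # number of data points
n_tasks = 8  # number of tasks (binary classes)
input_dim = 256  # input vector size
# generate random X vectors and random Y labels
# (binary labels 0/1, or -1 for a missing value)
x_tr = np.random.rand(n, input_dim)
x_test = np.random.rand(5, input_dim)
y_tr = np.random.randint(3, size=(n, n_tasks)) - 1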
def build_masked_loss(loss_function, mask_value=MASK_VALUE):
    """Builds a loss function that masks based on targets

    Args:
        loss_function: The loss function to mask
        mask_value: The value to mask in the targets

    Returns:
        function: a loss function that acts like loss_function with masked inputs
    """
    def masked_loss_function(y_true, y_pred):
        mask = K.cast(K.not_equal(y_true, mask_value), K.floatx())
        return loss_function(y_true * mask, y_pred * mask)
    return masked_loss_function

def masked_accuracy(y_true, y_pred):
    dtype = K.floatx()
    total = K.sum(K.cast(K.not_equal(y_true, MASK_VALUE), dtype))
    correct = K.sum(K.cast(K.equal(y_true, K.round(y_pred)), dtype))
    return correct / total
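To make the intent of the two functions above easier to check by hand, here is a plain NumPy restatement on a tiny example. It is only a sketch and not the same computation as the Keras code: it averages the loss over the known targets, and the names `masked_bce_numpy` and `masked_accuracy_numpy` are made up for illustration.

```python
import numpy as np

MASK_VALUE = -1  # same sentinel as in the toy problem


def masked_bce_numpy(y_true, y_pred, mask_value=MASK_VALUE, eps=1e-7):
    """Mean binary cross-entropy over known targets only (illustrative sketch)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1.0 - eps)
    mask = (y_true != mask_value).astype(float)
    # Replace the sentinel by 0 so the log terms stay well defined;
    # the mask removes those entries from the sum anyway.
    y_safe = np.where(mask == 1.0, y_true, 0.0)
    per_entry = -(y_safe * np.log(y_pred) + (1.0 - y_safe) * np.log(1.0 - y_pred))
    return float((per_entry * mask).sum() / max(mask.sum(), 1.0))


def masked_accuracy_numpy(y_true, y_pred, mask_value=MASK_VALUE):
    """Fraction of known targets whose rounded prediction matches."""
    y_true = np.asarray(y_true, dtype=float)
    known = y_true != mask_value
    hits = (np.round(np.asarray(y_pred, dtype=float)) == y_true) & known
    return float(hits.sum() / max(known.sum(), 1))


# Two samples, three tasks; -1 marks an unknown target.
y_true = np.array([[1, -1, 0], [-1, -1, 1]])
y_pred = np.array([[0.9, 0.2, 0.1], [0.7, 0.6, 0.8]])

loss = masked_bce_numpy(y_true, y_pred)
acc = masked_accuracy_numpy(y_true, y_pred)
print(loss, acc)
```

Changing a prediction at a masked position (for example `y_pred[0, 1]`) leaves both values unchanged, which is the property the masking is supposed to guarantee.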
I then created the following model:
input_values = Input(shape=(input_dim,))
x = Dense(1024, activation='relu')(input_values)
x = Dense(256, activation='relu')(x)
x = Dense(64, activation='relu')(x)
network_outputs = Dense(n_tasks, activation='sigmoid')(x)
model = Model(inputs=input_values, outputs=network_outputs)
model.compile(loss=build_masked_loss(K.binary_crossentropy), optimizer='adam', metrics=[masked_accuracy])
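The problem is that it cannot predict several values at once; it predicts only one value at a time. I found a solution here: https://github.com/keras-team/keras/issues/3206, but it is outdated, so I modified the code to use Masking and Multiply layers: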
mask_input = Input(shape=(n_tasks,))
mask = Masking(mask_value=-1)(mask_input)
network_outputs = Multiply()([network_outputs, mask])
model = Model(inputs=[input_values, mask_input], outputs=network_outputs)
But it returns:
InvalidArgumentError Traceback (most recent call last)
<ipython-input-260-2344b9c44c60> in <module>()
----> 1 model.fit(x=[x_tr, y_tr], y=y_tr)
[..]
InvalidArgumentError: Incompatible shapes: [25,8] vs. [25]
[[{{node loss_21/multiply_30_loss/mul_2}}]]
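The shapes in the message suggest that a per-sample quantity of shape [25] meets a per-task quantity of shape [25, 8]. I cannot tell from the traceback alone which tensors Keras is multiplying here, so the NumPy sketch below only reproduces the same kind of mismatch and shows how an explicit trailing axis resolves it; the array names are hypothetical.

```python
import numpy as np

per_task = np.ones((25, 8))  # one value per sample and task, e.g. an unreduced loss
per_sample = np.ones(25)     # one value per sample, e.g. a per-sample mask

try:
    per_task * per_sample    # trailing axes 8 and 25 do not match
    broadcast_ok = True
except ValueError:
    broadcast_ok = False

# Adding a trailing axis lines the per-sample values up with the rows.
aligned = per_task * per_sample[:, None]
print(broadcast_ok, aligned.shape)
```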
I tried to find similar questions, but without success.
Update: I changed the code to
input_values = Input(shape=(input_dim,))
mask_input = Input(shape=(1, n_tasks))
mask = Masking(mask_value=-1, input_shape=(None, n_tasks))(mask_input)
x = Dense(1024, activation='relu')(input_values)
x = Dense(256, activation='relu')(x)
x = Dense(64, activation='relu')(x)
network_outputs = Dense(n_tasks, activation='sigmoid')(x)
network_outputs = Multiply()([network_outputs, mask])
model = Model(inputs=[input_values, mask_input], outputs=network_outputs)
model.compile(loss=build_masked_loss(K.binary_crossentropy), optimizer='adam', metrics=[masked_accuracy])#(optimizer=Adam(lr=1e-4), loss=K.binary_crossentropy, metrics=['accuracy'])
and reshaped the target vector before passing it to the network:
y_tr = y_tr.reshape(n, 1, n_tasks)
Training now starts, but the predicted outputs are all 1.
model.fit(x=[x_tr, y_tr], y=y_tr, epochs=10)
[...]
Epoch 10/10
25/25 [==============================] - 0s 495us/step - loss: 5.1481e-04 - masked_accuracy: 1.5267
Test:
full_mask = np.array([1] * n_tasks)
model.predict([x_tr[0].reshape(1, -1), full_mask.reshape(1, 1, n_tasks)])
array([[[0.9953357 , 0.9996921 , 0.99997514, 0.99980736, 0.999151 ,
0.9999542 , 0.9999195 , 0.99996036]]], dtype=float32)
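One thing that may be worth checking in the updated model: `model.fit(x=[x_tr, y_tr], ...)` feeds the labels themselves (values -1, 0 and 1) into `mask_input`. As far as I understand `Masking`, it passes its input values through and only zeroes timesteps in which every feature equals `mask_value`, so `Multiply` would scale the sigmoid outputs by the raw labels rather than by a 0/1 mask. The NumPy sketch below mimics that arithmetic under this assumption. A "network" that always outputs 1 then reproduces every label, including the -1 entries, and the same counting as in `masked_accuracy` yields a value above 1. The reported 1.5267 equals 200/131 to four decimals, which would fit 200 matching entries over 131 known ones, although I have not verified this against the actual run.

```python
import numpy as np

MASK_VALUE = -1
rng = np.random.RandomState(0)
y_true = rng.randint(3, size=(25, 8)) - 1  # labels in {-1, 0, 1}, as in the toy problem

always_one = np.ones(y_true.shape)         # stand-in for a saturated sigmoid output

# Assumed behaviour of the updated model: outputs are multiplied by the raw labels.
leaky_pred = always_one * y_true

known = y_true != MASK_VALUE
total = known.sum()
correct = (np.round(leaky_pred) == y_true).sum()  # same counting as masked_accuracy
leaky_accuracy = correct / total

# Alternative: a 0/1 mask derived from the labels carries no label information.
binary_mask = known.astype(float)
masked_pred = always_one * binary_mask
correct_known = ((np.round(masked_pred) == y_true) & known).sum()
honest_accuracy = correct_known / total

print(leaky_accuracy, honest_accuracy)
```

If this reading is right, feeding a mask such as `(y_tr != MASK_VALUE).astype('float32')` instead of `y_tr` itself, and restricting the `correct` count to known entries, would be the first things to try.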