I created a custom loss function, shown below:
import tensorflow as tf
import tensorflow.keras.backend as K

def custom_loss(y_true, y_pred):
    y_true = K.cast(y_true, tf.float32)
    y_pred = K.cast(y_pred, tf.float32)
    # mask is 1 where the signs of y_true and y_pred disagree, 0 where they agree
    mask = K.sign(y_true) * K.sign(y_pred)
    mask = ((mask * -1) + 1) / 2
    # sum the magnitudes of the targets whose sign was predicted incorrectly
    losses = K.abs(y_true * mask)
    return K.sum(losses)
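To make the mask's behavior concrete, here is a minimal NumPy sketch of the same arithmetic (my own illustration, not from the original post): the mask is 1 exactly where the signs of y_true and y_pred disagree, so the loss sums the magnitudes of the mis-signed targets and is 0 when every sign matches.

```python
import numpy as np

def custom_loss_np(y_true, y_pred):
    # 1 where signs disagree, 0 where they agree
    mask = (1 - np.sign(y_true) * np.sign(y_pred)) / 2
    return np.sum(np.abs(y_true * mask))

y_true = np.array([1.0, -2.0, 3.0])
y_pred = np.array([0.5, 0.5, -0.1])    # wrong sign on the last two entries
print(custom_loss_np(y_true, y_pred))  # 2.0 + 3.0 = 5.0
print(custom_loss_np(y_true, y_true))  # 0.0, all signs agree
```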
However, when I try to train the model with this loss function, it does not train at all. The model works correctly with other loss functions such as mse and mae, and I have tried a wide range of learning rates and model complexities.
Below is an example run in which no training takes place.
model = get_compiled_model()
print(model.predict(train_x)[:10])
model.fit(train_x, train_y, epochs=5, verbose=1)
print(model.predict(train_x)[:10])
model.fit(train_x, train_y, epochs=5, verbose=1)
print(model.predict(train_x)[:10])
[[0.19206487]
[0.19201839]
[0.19199933]
[0.19199185]
[0.19206186]
[0.19208357]
[0.1920282 ]
[0.19203594]
[0.1919941 ]
[0.19202243]]
Epoch 1/5
1/1 [==============================] - 0s 1ms/step - loss: 0.0179
Epoch 2/5
1/1 [==============================] - 0s 2ms/step - loss: 0.0179
Epoch 3/5
1/1 [==============================] - 0s 1ms/step - loss: 0.0179
Epoch 4/5
1/1 [==============================] - 0s 1ms/step - loss: 0.0179
Epoch 5/5
1/1 [==============================] - 0s 2ms/step - loss: 0.0179
[[0.19206487]
[0.19201839]
[0.19199933]
[0.19199185]
[0.19206186]
[0.19208357]
[0.1920282 ]
[0.19203594]
[0.1919941 ]
[0.19202243]]
Epoch 1/5
1/1 [==============================] - 0s 1ms/step - loss: 0.0179
Epoch 2/5
1/1 [==============================] - 0s 2ms/step - loss: 0.0179
Epoch 3/5
1/1 [==============================] - 0s 2ms/step - loss: 0.0179
Epoch 4/5
1/1 [==============================] - 0s 951us/step - loss: 0.0179
Epoch 5/5
1/1 [==============================] - 0s 1ms/step - loss: 0.0179
[[0.19206487]
[0.19201839]
[0.19199933]
[0.19199185]
[0.19206186]
[0.19208357]
[0.1920282 ]
[0.19203594]
[0.1919941 ]
[0.19202243]]
The 2-D arrays in the output above are the model's first 10 predictions, and they do not change in the slightest even after 5 epochs of training.
My gut tells me something is wrong with the loss function, but I can't work out what.
The model looks like this:
def get_compiled_model():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(10, input_dim=2*training_size+1, activation='softmax'),
        tf.keras.layers.Dense(10, activation='softmax'),
        tf.keras.layers.Dense(1, activation='tanh')
    ])
    opt = tf.keras.optimizers.Adam(learning_rate=0.0005)
    model.compile(optimizer=opt,
                  loss=custom_loss,
                  metrics=[])
    return model
Answer 0 (score: 1):
I ran your model and loss function on some fake data, because I wanted to check the derivatives.
import numpy

if __name__ == "__main__":
    m = get_compiled_model()
    x = numpy.random.random((1000, 21))
    x = numpy.array(x, dtype="float32")
    exp_y = numpy.random.random((1000, 1))
    exp_y = (exp_y > 0.5) * 1.0
    with tf.GradientTape() as tape:
        y = m(x)
        loss = custom_loss(y, exp_y)
        #loss = keras.losses.mse(y, exp_y)
    grad = tape.gradient(loss, m.trainable_variables)
    for var, g in zip(m.trainable_variables, grad):
        # note: despite the 'shape' label, this prints the sum of squared gradient entries
        print(f'{var.name}, shape: {K.sum(g*g)}')
For the mse loss function:
dense/kernel:0, shape: 2817.013671875
dense/bias:0, shape: 530.52197265625
dense_1/kernel:0, shape: 3826.3974609375
dense_1/bias:0, shape: 25160.9375
dense_2/kernel:0, shape: 125238.34375
dense_2/bias:0, shape: 1241268.5
For the custom loss function:
dense/kernel:0, shape: 34.87071228027344
dense/bias:0, shape: 6.6609962463378906
dense_1/kernel:0, shape: 107.27591705322266
dense_1/bias:0, shape: 824.83740234375
dense_2/kernel:0, shape: 5944.91796875
dense_2/bias:0, shape: 59201.58203125
We can see that the sums of squared derivatives differ by orders of magnitude between the two losses. Even on this random data, the mse loss function causes the model's output to change over time, while the custom loss barely moves the weights.
It may be that this only holds for the fake data I made.
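A side note of my own (not part of the original answer): TensorFlow registers the gradient of the sign op as zero everywhere, so any gradient that reaches y_pred only through the sign-based mask vanishes. A quick check:

```python
import tensorflow as tf

x = tf.Variable([0.5, -1.5])
with tf.GradientTape() as tape:
    y = tf.sign(x)
# the Sign op's registered gradient is zero at every point
grad = tape.gradient(y, x)
print(grad.numpy())  # [0. 0.]
```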