I want to write a custom loss function for a multilayer perceptron network in Keras. The loss has two components: the first is the regular 'mse', and the second involves the element-wise gradients of the output with respect to the input features. Suppose x is the input with 2 features (size: number of samples X 2) and y is the output with a single feature (size: number of samples X 1). I denote the derivative of each output sample with respect to the first feature of the corresponding input sample as $\frac{dy[:]}{dx[:,0]}$. Using this notation, I want to compute the following expression inside the loss function:
$$r[:] = y[:] \frac{dy[:]}{dx[:,0]} - x[:,1] \frac{d^2y[:]}{dx[:,0]^2}$$
and take the mean square of the r vector. The total loss is the sum of the regular 'mse' and the mean square of the r vector.
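To make the notation explicit, with $N$ samples in a batch, the total loss I am after is:
$$\mathcal{L} = \underbrace{\frac{1}{N}\sum_{i=1}^{N}\left(y_i^{\text{true}} - y_i\right)^2}_{\text{regular 'mse'}} + \underbrace{\frac{1}{N}\sum_{i=1}^{N} r_i^2}_{\text{mean square of } r}, \qquad r_i = y_i\,\frac{dy_i}{dx_{i,0}} - x_{i,1}\,\frac{d^2 y_i}{dx_{i,0}^2}$$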
Here is a minimal, reproducible example of the code I have tried:
import numpy as np
import matplotlib.pyplot as plt
from tensorflow import keras
import tensorflow as tf
import tensorflow.keras.backend as kb
def custom_loss_envelop(model_inputs, model_outputs):
    def custom_loss(y_true, y_pred):
        mse_loss = keras.losses.mean_squared_error(y_true, y_pred)
        print()
        print(model_inputs); print()
        print(model_outputs); print()
        dy_dx = kb.gradients(model_outputs, tf.gather(model_inputs, [0], axis=1))
        print(dy_dx); print()
        d2y_dx2 = kb.gradients(dy_dx, tf.gather(model_inputs, [0], axis=1))
        print(d2y_dx2); print()
        r = tf.multiply(model_outputs, tf.gather(dy_dx, [0], axis=1)) - tf.multiply(tf.gather(model_inputs, [1], axis=1), tf.gather(d2y_dx2, [0], axis=1))  # y*dy_dx[0] - x[:,1]*d2y_dx2[0]
        r = kb.mean(kb.square(r))
        loss = mse_loss + r
        return loss
    return custom_loss
nx = 100
inputs_train = np.random.uniform(0, 1, (nx, 2))
outputs_train = np.random.uniform(0, 1, (nx, 1))
inputs_val = np.random.uniform(0, 1, (nx // 2, 2))
outputs_val = np.random.uniform(0, 1, (nx // 2, 1))
n_hidden_units = 50
l2_reg_lambda = 0
learning_rate = 0.001
dropout_factor = 0.0
epochs = 3
model = keras.Sequential()
model.add(keras.layers.Dense(n_hidden_units, activation='relu', input_shape=(inputs_train.shape[1],),
                             kernel_regularizer=keras.regularizers.l2(l2_reg_lambda)))  # first hidden layer
model.add(keras.layers.Dropout(dropout_factor))
model.add(keras.layers.BatchNormalization())
model.add(keras.layers.Dense(n_hidden_units, activation='relu', kernel_regularizer=keras.regularizers.l2(l2_reg_lambda)))  # second hidden layer
model.add(keras.layers.Dropout(dropout_factor))
model.add(keras.layers.BatchNormalization())
model.add(keras.layers.Dense(n_hidden_units, activation='relu', kernel_regularizer=keras.regularizers.l2(l2_reg_lambda)))  # third hidden layer
model.add(keras.layers.Dropout(dropout_factor))
model.add(keras.layers.BatchNormalization())
model.add(keras.layers.Dense(outputs_train.shape[1], activation='linear'))  # linear output layer
optimizer1 = keras.optimizers.Adam(learning_rate=learning_rate, beta_1=0.9, beta_2=0.999, amsgrad=True)
model.compile(loss=custom_loss_envelop(model.inputs, model.outputs), optimizer=optimizer1, metrics=['mse'])
model.fit(inputs_train, outputs_train, batch_size=100, epochs=epochs, shuffle=True, validation_data=(inputs_val,outputs_val), verbose=1)
Here, the training and validation samples are randomly generated. I get the tensor shapes as follows:

model_inputs: [<tf.Tensor 'dense_input:0' shape=(None, 2) dtype=float32>]
model_outputs: [<tf.Tensor 'dense_3/Identity:0' shape=(None, 1) dtype=float32>]
dy_dx: [None]

The first two are as expected, but the derivative should also have shape (None, 1); instead it is [None]. As a result, the line

d2y_dx2 = kb.gradients(dy_dx, tf.gather(model_inputs, [0], axis=1))

fails with

AttributeError: 'NoneType' object has no attribute 'op'
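My suspicion is that tf.gather(model_inputs, [0], axis=1) builds a new tensor that branches off the input, and since model_outputs was never computed through that tensor, kb.gradients has no differentiation path to it and returns [None]. A minimal standalone check of that suspicion, outside of any model (assuming TF 2.x with v1-style graph mode; the toy function y here is just a stand-in for the model output):

import tensorflow as tf
tf.compat.v1.disable_eager_execution()  # kb.gradients needs graph mode
import tensorflow.keras.backend as kb

x = tf.compat.v1.placeholder(tf.float32, shape=(None, 2))
y = tf.reduce_sum(tf.square(x), axis=1, keepdims=True)  # toy stand-in for the model output

print(kb.gradients(y, x))                          # [<tf.Tensor ... shape=(None, 2) ...>]: connected
print(kb.gradients(y, tf.gather(x, [0], axis=1)))  # [None]: the gathered tensor is not on the path to y

This matches the [None] I see inside the loss function.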
Any help in fixing this problem, or an alternative solution, would be appreciated.
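Update: one direction I am considering as a workaround, sketched below but not yet tested inside the training loop, is to differentiate with respect to the full input tensor model_inputs[0] (which is on the forward path) and slice out the first column of the gradient afterwards. The inner function would replace custom_loss in custom_loss_envelop above:

def custom_loss(y_true, y_pred):
    x = model_inputs[0]   # input tensor, shape (None, 2), on the forward path
    y = model_outputs[0]  # output tensor, shape (None, 1)
    mse_loss = keras.losses.mean_squared_error(y_true, y_pred)
    dy_dx = kb.gradients(y, x)[0]                  # shape (None, 2)
    dy_dx0 = dy_dx[:, 0:1]                         # dy/dx[:,0], shape (None, 1)
    d2y_dx02 = kb.gradients(dy_dx0, x)[0][:, 0:1]  # d2y/dx[:,0]^2, shape (None, 1)
    r = y * dy_dx0 - x[:, 1:2] * d2y_dx02          # r = y*dy/dx[:,0] - x[:,1]*d2y/dx[:,0]^2
    return mse_loss + kb.mean(kb.square(r))

Would this be equivalent to what I described above, or is there a cleaner way?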