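I'm doing some machine learning and have to work with a custom loss function. Its gradient and Hessian are hard to derive by hand, so I resorted to computing them automatically with TensorFlow.
Here is an example.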
import numpy as np
import tensorflow as tf
y_true = np.array([
[1, 0, 0, 0, 0],
[0, 1, 0, 0, 0],
[0, 0, 0, 1, 0],
[0, 0, 1, 0, 0],
[0, 0, 0, 0, 1],
[0, 0, 0, 0, 1],
[0, 0, 0, 0, 1]
], dtype=float)
y_pred = np.array([
[1, 0, 0, 0, 0],
[0, 1, 0, 0, 0],
[0, 0, 0, 1, 0],
[0, 0, 1, 0, 0],
[0, 0, 0, 0, 1],
[0, 0, 0, 0, 1],
[0, 0, 0, 0, 1]
], dtype=float)
weights = np.array([1, 1, 1, 1, 1], dtype=float)
with tf.Session():
    # We first convert the numpy arrays to Tensorflow tensors
    y_true = tf.convert_to_tensor(y_true)
    y_pred = tf.convert_to_tensor(y_pred)
    weights = tf.convert_to_tensor(weights)
    # The following code block is a custom loss
    ys = tf.reduce_sum(y_true, axis=0)
    y_true = y_true / ys
    ln_p = tf.nn.log_softmax(y_pred)
    wll = tf.reduce_sum(y_true * ln_p, axis=0)
    loss = -tf.tensordot(weights, wll, axes=1)
    grad = tf.gradients(loss, y_pred)[0]
    hess = tf.hessians(loss, y_pred)[0]
    hess = tf.diag_part(hess)
    print(hess.eval())
This prints:
[[0.24090069 0.12669198 0.12669198 0.12669198 0.12669198]
[0.12669198 0.24090069 0.12669198 0.12669198 0.12669198]
[0.12669198 0.12669198 0.12669198 0.24090069 0.12669198]
[0.12669198 0.12669198 0.24090069 0.12669198 0.12669198]
[0.04223066 0.04223066 0.04223066 0.04223066 0.08030023]
[0.04223066 0.04223066 0.04223066 0.04223066 0.08030023]
[0.04223066 0.04223066 0.04223066 0.04223066 0.08030023]]
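I'm happy that this works, but the problem is that it doesn't scale. For my use case I only need the diagonal of the Hessian. I managed to extract it with hess = tf.diag_part(hess), but this still computes the full Hessian, which is unnecessary. The overhead is so large that I can't use it even on a moderately sized dataset (about 100,000 rows).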
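For reference, here is a minimal sketch of what computing only the diagonal could look like for this particular loss. It assumes the loss is exactly the one above. Writing c[i, j] = weights[j] * y_true[i, j] / ys[j] and p = softmax(y_pred) taken row-wise, the second derivative of the loss with respect to y_pred[i, k] works out to (sum_j c[i, j]) * p[i, k] * (1 - p[i, k]). The function names softmax and hessian_diag are my own, not TensorFlow API, and the sketch uses plain NumPy rather than automatic differentiation, so it does not carry over to other losses.

```python
import numpy as np

def softmax(z):
    # Row-wise softmax, shifted by the row maximum for numerical stability.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def hessian_diag(y_true, y_pred, weights):
    # Hypothetical helper: closed-form diagonal of the Hessian of the loss
    # above with respect to y_pred. Assumes every class occurs at least
    # once in y_true, otherwise the column sums contain zeros.
    c = weights * y_true / y_true.sum(axis=0)   # c[i, j] = w_j * y_ij / ys_j
    row_mass = c.sum(axis=1, keepdims=True)     # sum_j c[i, j], shape (N, 1)
    p = softmax(y_pred)
    return row_mass * p * (1.0 - p)             # same shape as y_pred

# Same data as in the example above: one-hot rows for classes 0, 1, 3, 2, 4, 4, 4.
y_true = np.eye(5)[[0, 1, 3, 2, 4, 4, 4]]
y_pred = y_true.copy()
weights = np.ones(5)

diag = hessian_diag(y_true, y_pred, weights)
print(diag)
```

On the data above this should reproduce the matrix printed by hess.eval(). It only ever allocates arrays of shape (N, 5), whereas the full Hessian for N rows has (5N)^2 entries, which for N = 100,000 is 2.5e11 values, or about 2 TB in float64.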