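For a neural network's custom loss I use a function of u, where u, given a pair (t, x) of points in an interval, is the output of my NN. The problem is that I'm stuck on how to compute the second derivative using K.gradients (K being the TensorFlow backend):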
def custom_loss(input_tensor, output_tensor):
    def loss(y_true, y_pred):
        # so far, I can only get this right, naturally:
        gradient = K.gradients(output_tensor, input_tensor)
        # here I'm falling badly:
        # d_t = K.gradients(output_tensor, input_tensor)[0]
        # dd_x = K.gradient(K.gradients(output_tensor, input_tensor),
        #                   input_tensor[1])
        return gradient  # obviously not useful, just for it to work
    return loss
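All my attempts, based on an Input(shape=(2,)), were variations of the commented lines in the snippet above, mainly trying to find the right indexing of the resulting tensor.
Sure enough, I lack knowledge of how exactly tensors work. By the way, I know that in TensorFlow itself I could simply use tf.hessians, but I noticed it is not available when using TF as a backend.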
Answer 0 (score: 3)
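For K.gradients() to work as a layer like that, you have to enclose it in a Lambda() layer, because otherwise a full Keras layer is not created and you can't chain it or train through it. So this code will work (tested):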
import keras
from keras.models import *
from keras.layers import *
from keras import backend as K
import tensorflow as tf
def grad( y, x ):
    return Lambda( lambda z: K.gradients( z[ 0 ], z[ 1 ] ), output_shape = [1] )( [ y, x ] )
def network( i, d ):
    m = Add()( [ i, d ] )
    a = Lambda(lambda x: K.log( x ) )( m )
    return a
fixed_input = Input(tensor=tf.constant( [ 1.0 ] ) )
double = Input(tensor=tf.constant( [ 2.0 ] ) )
a = network( fixed_input, double )
b = grad( a, fixed_input )
c = grad( b, fixed_input )
d = grad( c, fixed_input )
e = grad( d, fixed_input )
model = Model( inputs = [ fixed_input, double ], outputs = [ a, b, c, d, e ] )
print( model.predict( x=None, steps = 1 ) )
def network models f(x) = log(x + 2), evaluated at x = 1. def grad is where the gradient calculation is done. This code outputs:
[array([1.0986123], dtype=float32), array([0.33333334], dtype=float32), array([-0.11111112], dtype=float32), array([0.07407408], dtype=float32), array([-0.07407409], dtype=float32)]
These are the correct values of log(3), ⅓, -1/3², 2/3³ and -6/3⁴.
For reference, the same calculation in plain TensorFlow (used for testing):
import tensorflow as tf
a = tf.constant( 1.0 )
a2 = tf.constant( 2.0 )
b = tf.log( a + a2 )
c = tf.gradients( b, a )
d = tf.gradients( c, a )
e = tf.gradients( d, a )
f = tf.gradients( e, a )
with tf.Session() as sess:
    print( sess.run( [ b, c, d, e, f ] ) )
It outputs the same values:
[1.0986123, [0.33333334], [-0.11111112], [0.07407408], [-0.07407409]]
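These numbers can also be checked without TensorFlow. A minimal sketch, assuming the closed form for the n-th derivative of log(x + 2), which is (-1)^(n-1) (n-1)! / (x + 2)^n for n ≥ 1; the helper name log_shift_derivative is made up for this illustration:

```python
import math

def log_shift_derivative(n, x, shift=2.0):
    """n-th derivative of log(x + shift) at x; n = 0 returns the function value."""
    if n == 0:
        return math.log(x + shift)
    # d^n/dx^n log(x + shift) = (-1)^(n-1) * (n-1)! / (x + shift)^n
    return (-1) ** (n - 1) * math.factorial(n - 1) / (x + shift) ** n

# Function value and first four derivatives at x = 1.
values = [log_shift_derivative(n, 1.0) for n in range(5)]
print(values)
```

The five entries correspond to log(3), ⅓, -1/3², 2/3³ and -6/3⁴, in that order.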
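tf.hessians() does return the second derivative; it is shorthand for chaining two tf.gradients() calls. The Keras backend has no hessians, though, so you have to chain two K.gradients() calls yourself.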
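If for some reason none of the above works, you might want to consider approximating the second derivative numerically by taking differences over a small distance ε. This basically triples the network evaluations for each input, so besides being less accurate, this solution also raises serious efficiency concerns. Anyway, the code (tested):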
import keras
from keras.models import *
from keras.layers import *
from keras import backend as K
import tensorflow as tf
def network( i, d ):
    m = Add()( [ i, d ] )
    a = Lambda(lambda x: K.log( x ) )( m )
    return a
fixed_input = Input(tensor=tf.constant( [ 1.0 ], dtype = tf.float64 ) )
double = Input(tensor=tf.constant( [ 2.0 ], dtype = tf.float64 ) )
epsilon = Input( tensor = tf.constant( [ 1e-7 ], dtype = tf.float64 ) )
eps_reciproc = Input( tensor = tf.constant( [ 1e+7 ], dtype = tf.float64 ) )
a0 = network( Subtract()( [ fixed_input, epsilon ] ), double )
a1 = network( fixed_input, double )
a2 = network( Add()( [ fixed_input, epsilon ] ), double )
d0 = Subtract()( [ a1, a0 ] )
d1 = Subtract()( [ a2, a1 ] )
dv0 = Multiply()( [ d0, eps_reciproc ] )
dv1 = Multiply()( [ d1, eps_reciproc ] )
dd0 = Multiply()( [ Subtract()( [ dv1, dv0 ] ), eps_reciproc ] )
model = Model( inputs = [ fixed_input, double, epsilon, eps_reciproc ], outputs = [ a0, dv0, dd0 ] )
print( model.predict( x=None, steps = 1 ) )
Outputs:
[array([1.09861226]), array([0.33333334]), array([-0.1110223])]
(This only goes up to the second derivative.)
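The same finite-difference idea can be sketched in plain Python, which makes the three evaluations per input explicit. This is only an illustration: f stands in for the network, second_difference is a hypothetical helper name, and the formula is the standard central second difference, which is algebraically the same as differencing the two first-order quotients above.

```python
import math

def f(x):
    # Same toy function as the network above: log(x + 2).
    return math.log(x + 2.0)

def second_difference(func, x, h):
    """Central second difference; needs three evaluations of func."""
    return (func(x + h) - 2.0 * func(x) + func(x - h)) / (h * h)

exact = -1.0 / 9.0  # analytic second derivative of log(x + 2) at x = 1
for h in (1e-2, 1e-4, 1e-7):
    approx = second_difference(f, 1.0, h)
    # Print step size, approximation, and absolute error.
    print(h, approx, abs(approx - exact))
```

The step size is a trade-off: a large step adds truncation error, while a very small one amplifies floating-point rounding, so the smallest ε is not necessarily the most accurate choice.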
Answer 1 (score: 0)
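The solution posted by Peter Szoldan is a great one. However, the way keras.layers.Input() takes arguments seems to have changed in the latest version with the tf2 backend. The following simple workaround will work, though: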
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import backend as K
import numpy as np
class CustomModel(tf.keras.Model):
    def __init__(self):
        super(CustomModel, self).__init__()
        # f(x) = log(x + 2); Lambda must be referenced through keras.layers
        self.input_layer = keras.layers.Lambda(lambda x: K.log( x + 2 ) )
    def findGrad(self, func, argm):
        # Wrap K.gradients in a Lambda layer so it can be chained
        return keras.layers.Lambda(lambda x: K.gradients(x[0], x[1]))([func, argm])
    def call(self, inputs):
        log_layer = self.input_layer(inputs)
        gradient_layer = self.findGrad(log_layer, inputs)
        hessian_layer = self.findGrad(gradient_layer, inputs)
        return hessian_layer
custom_model = CustomModel()
x = np.array([[0.],
              [1],
              [2]])
print(custom_model.predict(x))
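For the indexing part of the original question (a derivative in t and a second derivative in x from an input of shape (2,)), a TF2 sketch with nested tf.GradientTape contexts may help. This is not the method used in either answer, and several things are assumptions made for illustration: u is an analytic stand-in for the network, and the residual u_t - u_xx is a hypothetical heat-equation-style term, since the actual loss formula is not shown in the question.

```python
import tensorflow as tf

def u(tx):
    # Hypothetical stand-in for the network: u(t, x) = exp(-t) * sin(x).
    t, x = tx[:, 0:1], tx[:, 1:2]
    return tf.exp(-t) * tf.sin(x)

# Batch of (t, x) pairs, shape (batch, 2), like Input(shape=(2,)).
tx = tf.constant([[0.5, 1.0], [0.1, 2.0]], dtype=tf.float32)

with tf.GradientTape() as outer:
    outer.watch(tx)
    with tf.GradientTape() as inner:
        inner.watch(tx)
        out = u(tx)
    # First derivatives, taken inside the outer tape so they are recorded.
    grads = inner.gradient(out, tx)  # columns: [du/dt, du/dx]
    u_t = grads[:, 0:1]
    u_x = grads[:, 1:2]

# Differentiate du/dx once more and keep the x column: d2u/dx2.
u_xx = outer.gradient(u_x, tx)[:, 1:2]

# Assumed residual; this particular u satisfies u_t = u_xx analytically.
residual = u_t - u_xx
print(residual.numpy())
```

The key point is that the gradient with respect to the whole input has one column per input variable, so the t and x derivatives are selected by slicing columns of the gradient rather than by indexing the input tensor.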