I am trying to get the activation values of every layer in this baseline autoencoder built with Keras, because I want to add a sparsity penalty to the loss function based on the Kullback-Leibler (KL) divergence, as shown here {{3}}.
In this case I would compute the KL divergence for each layer and then sum all of them together with the main loss function, e.g. MSE.
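For concreteness, the penalty I have in mind is the standard sparse-autoencoder KL term; a rough NumPy sketch (the names kl_sparsity and p_hat are just illustrative):
import numpy as np
# KL(p || p_hat) = p*log(p/p_hat) + (1-p)*log((1-p)/(1-p_hat)),
# where p is the target sparsity and p_hat is the observed mean activation
def kl_sparsity(p, p_hat):
    return p * np.log(p / p_hat) + (1 - p) * np.log((1 - p) / (1 - p_hat))
# the total loss would then be something like: MSE + beta * sum of the KL terms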
So I wrote a script in Jupyter, but every time I try to compile it I get ZeroDivisionError: integer division or modulo by zero.
Here is the code:
import numpy as np
from keras.layers import Conv2D, Activation
from keras.models import Sequential
from keras import backend as K
from keras import losses
x_train = np.random.rand(128,128).astype('float32')
kl = K.placeholder(dtype='float32')
beta = K.constant(value=5e-1)
p = K.constant(value=5e-2)
# encoder
model = Sequential()
model.add(Conv2D(filters=16,kernel_size=(4,4),padding='same',
name='encoder',input_shape=(128,128,1)))
model.add(Activation('relu'))
# get the average activation
A = K.mean(x=model.output)
# calculate the value for the KL divergence
kl = K.concatenate([kl, losses.kullback_leibler_divergence(p, A)],axis=0)
# decoder
model.add(Conv2D(filters=1,kernel_size=(4,4),padding='same', name='encoder'))
model.add(Activation('relu'))
B = K.mean(x=model.output)
kl = K.concatenate([kl, losses.kullback_leibler_divergence(p, B)],axis=0)
This seems to be the cause:
/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py in _normalize_axis(axis, ndim)
    989     else:
    990         if axis is not None and axis < 0:
    991             axis %= ndim    <----------
    992     return axis
    993
So there is probably something wrong with the mean calculation. If I print the value I get
Tensor("Mean_10:0", shape=(), dtype=float32)
which is strange, since the weights and biases are initialized to non-zero values, so there may be something wrong with the way I am getting the activation values.
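(Side note: print only shows the symbolic tensor, not its value; to inspect the actual mean I would presumably need to evaluate it with a backend function, along these lines:)
# hypothetical check: evaluate the mean activation on one input sample
get_mean = K.function([model.input], [A])
print(get_mean([x_train.reshape(1, 128, 128, 1)])[0])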
I am really at a loss on how to solve this, as I am not a skilled programmer. Can anyone help me understand where I went wrong?
Answer 0 (score: 1)
First of all, you should not perform calculations outside of layers; the Model must track all calculations.
If you need a specific calculation in the middle of the model, you should use a Lambda layer.
If you need to use a specific output in the loss function, you should split the model at that output and do the calculations inside a custom loss function.
Here, I used Lambda layers to calculate the means, and customLoss to calculate the Kullback-Leibler divergence.
import numpy as np
from keras.layers import *
from keras.models import Model
from keras import backend as K
from keras import losses
x_train = np.random.rand(128,128).astype('float32')
kl = K.placeholder(dtype='float32') # you will probably not need this anymore, since each loss is treated individually per output
beta = K.constant(value=5e-1)
p = K.constant(value=5e-2)
# encoder
inp = Input((128,128,1))
lay = Conv2D(filters=16, kernel_size=(4,4), padding='same', name='encoder', activation='relu')(inp)
# apply the mean using a Lambda layer (per-sample mean over all remaining axes):
intermediateOut = Lambda(lambda x: K.mean(x, axis=[1,2,3]), output_shape=(1,))(lay)
# decoder (the name must differ from 'encoder'; layer names have to be unique)
finalOut = Conv2D(filters=1, kernel_size=(4,4), padding='same', name='decoder', activation='relu')(lay)
# but from that, let's also calculate a mean output for the loss:
meanFinalOut = Lambda(lambda x: K.mean(x, axis=[1,2,3]), output_shape=(1,))(finalOut)
#Now, you have to create a model taking one input and those three outputs:
splitModel = Model(inp,[intermediateOut,meanFinalOut,finalOut])
Finally, compile the model with the custom loss function (we will define it later). But since I don't know whether you actually use the final (non-mean) output for training, I suggest creating one model for training and another one for predicting:
trainingModel = Model(inp,[intermediateOut,meanFinalOut])
trainingModel.compile(...,loss=customLoss)
predictingModel = Model(inp,finalOut)
#you don't need to compile the predicting model since you're only training the trainingModel
#both will share the same weights, you train one, and predict in the other
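Once compiled, training could look roughly like this (my own sketch, not part of the original answer: the input must be reshaped to (batch, 128, 128, 1), p_target is a hypothetical array holding the desired mean activation, and the target shapes may need adjusting to match the Lambda outputs):
x = x_train.reshape(1, 128, 128, 1)                 # a batch of one image
p_target = np.full((1, 1), 5e-2, dtype='float32')   # desired mean activation per sample
trainingModel.fit(x, [p_target, p_target], epochs=10)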
Our custom loss function should then handle the Kullback-Leibler term:
def customLoss(y_true, y_pred): # Keras passes (y_true, y_pred) to loss functions
    return # your own Kullback-Leibler expression (I don't know exactly how it should work, but maybe Keras' kullback_leibler_divergence can be used with single values?)
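If it helps, here is one concrete possibility (my own assumption, using the standard sparse-autoencoder form, where y_true carries the target sparsity p and y_pred is the observed mean):
def customLoss(y_true, y_pred):
    # clip to avoid log(0)
    y_pred = K.clip(y_pred, K.epsilon(), 1 - K.epsilon())
    return (y_true * K.log(y_true / y_pred)
            + (1 - y_true) * K.log((1 - y_true) / (1 - y_pred)))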
Alternatively, if you prefer to call a single loss function instead of two:
summedMeans = Add()([intermediateOut, meanFinalOut])
trainingModel = Model(inp, summedMeans)
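and then compile the single-output model as before (a sketch; the optimizer choice is my own assumption):
trainingModel.compile(optimizer='adam', loss=customLoss)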