I'm trying to write an implementation of a variational autoencoder, but I'm having some trouble with the loss function:
def vae_loss(sigma, mu):
    def loss(y_true, y_pred):
        recon = K.sum(K.binary_crossentropy(y_true, y_pred), axis=-1)
        kl = 0.5 * K.sum(K.exp(sigma) + K.square(mu) - 1. - sigma, axis=-1)
        return recon + kl
    return loss
The binary crossentropy part works fine, but whenever I return only the divergence term kl for testing, I get the following error: ValueError: "Tried to convert 'x' to a tensor and failed. Error: None values not supported."
I'd appreciate any hints about what I'm doing wrong. You'll find my full code below. Thanks for your time!
import numpy as np
from keras import Model
from keras.layers import Input, Dense, Lambda
import keras.backend as K
from keras.datasets import mnist
from matplotlib import pyplot as plt


class VAE(object):

    def __init__(self, n_latent, batch_size):
        self.encoder, self.encoder_input, self.mu, self.sigma = self.create_encoder(n_latent, batch_size)
        self.decoder, self.decoder_input, self.decoder_output = self.create_decoder(n_latent, batch_size)
        pipeline = self.decoder(self.encoder.outputs[0])

        def vae_loss(sigma, mu):
            def loss(y_true, y_pred):
                recon = K.sum(K.binary_crossentropy(y_true, y_pred), axis=-1)
                kl = 0.5 * K.sum(K.exp(sigma) + K.square(mu) - 1. - sigma, axis=-1)
                return recon + kl
            return loss

        self.VAE = Model(self.encoder_input, pipeline)
        self.VAE.compile(optimizer="adadelta", loss=vae_loss(self.sigma, self.mu))

    def create_encoder(self, n_latent, batch_size):
        input_layer = Input(shape=(784,))
        #net = Dense(512, activation="relu")(input_layer)
        mu = Dense(n_latent, activation="linear")(input_layer)
        print(mu)
        sigma = Dense(n_latent, activation="linear")(input_layer)

        def sample_z(args):
            mu, log_sigma = args
            eps = K.random_normal(shape=(K.shape(input_layer)[0], n_latent), mean=0., stddev=1.)
            K.print_tensor(K.shape(eps))
            return mu + K.exp(log_sigma / 2) * eps

        sample_z = Lambda(sample_z)([mu, sigma])

        model = Model(inputs=input_layer, outputs=[sample_z, mu, sigma])
        return model, input_layer, mu, sigma

    def create_decoder(self, n_latent, batch_size):
        input_layer = Input(shape=(n_latent,))
        #net = Dense(512, activation="relu")(input_layer)
        reconstruct = Dense(784, activation="linear")(input_layer)
        model = Model(inputs=input_layer, outputs=reconstruct)
        return model, input_layer, reconstruct
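For context, this is roughly how I drive the class on MNIST; the preprocessing, latent size, and fit() arguments here are just a sketch, not necessarily what I use in my experiments:

# Usage sketch: train the VAE on flattened, normalised MNIST digits.
(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.
x_test = x_test.reshape(-1, 784).astype("float32") / 255.

vae = VAE(n_latent=2, batch_size=100)
vae.VAE.fit(x_train, x_train,
            batch_size=100,
            epochs=10,
            validation_data=(x_test, x_test))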
Answer 0 (score: 0):
I'll assume the error appears when you are "testing"/debugging your training phase, i.e. during backpropagation (let me know if I'm wrong).
If so, the problem is that you are asking Keras to optimize the whole network (encoder plus decoder) while the loss you return (kl) only covers the encoder part. The decoder's gradients stay undefined (there is no term such as recon covering it), which causes the optimization error.
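You can check this directly. A small sketch (the instantiation values are placeholders I picked) that builds the KL term on its own and asks the backend for its gradients with respect to the decoder's weights:

# The KL term depends only on the encoder's mu/sigma, so the gradients of the
# decoder's weights with respect to it come back as None.
vae = VAE(n_latent=2, batch_size=100)
kl = 0.5 * K.sum(K.exp(vae.sigma) + K.square(vae.mu) - 1. - vae.sigma, axis=-1)
print(K.gradients(kl, vae.decoder.trainable_weights))  # -> [None, None]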
For your debugging purposes, the error will disappear if you compile and fit only the encoder with this truncated loss (kl), or if you come up with a dummy (differentiable) loss that also covers the decoder (e.g. K.sum(y_pred - y_pred, axis=-1) + kl).
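As a sketch of that second option (the wrapper name debug_loss is mine, purely for illustration):

def debug_loss(sigma, mu):
    def loss(y_true, y_pred):
        kl = 0.5 * K.sum(K.exp(sigma) + K.square(mu) - 1. - sigma, axis=-1)
        # y_pred - y_pred is identically zero, but it makes the loss depend on the
        # decoder's output, so its gradients are defined (zero) instead of None.
        return K.sum(y_pred - y_pred, axis=-1) + kl
    return loss

vae.VAE.compile(optimizer="adadelta", loss=debug_loss(vae.sigma, vae.mu))
# vae.VAE.fit(...) should now run through backprop without the ValueError.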