What I essentially want to run is an autoencoder, but the input is itself produced by a part of the model. Now I need to know how to define the loss correctly and how to stop backpropagation at that input layer.
This is a sub-question to another question I posted: How to create an autoencoder where each layer of encoder should represent the same as a layer of the decoder. In that question I tried to build an autoencoder in which each pair of corresponding layers is also trained as an autoencoder. Unfortunately, the custom regularization layer I used to define the sub-autoencoders did not stop backpropagation at the start of each sub-autoencoder.
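In isolation, I would expect the Lambda/stop_gradient pattern I use below to freeze the layer in front of it; a minimal sketch of that expectation (toy sizes and random data, not my actual setup) looks like this:

import numpy as np
import keras
from keras.layers import Dense, Input, Lambda
from keras.models import Model

# Toy check: gradients should stop at the Lambda layer, so `first` should not be updated
inp = Input(shape=(8,))
first = Dense(4, activation='sigmoid')
hidden = first(inp)
frozen = Lambda(lambda t: keras.backend.stop_gradient(t))(hidden)
out = Dense(4, activation='sigmoid')(frozen)
toy = Model(inp, out)
toy.compile(loss='mse', optimizer='adam')
w_before = first.get_weights()[0]
toy.fit(np.random.rand(32, 8), np.random.rand(32, 4), epochs=1, verbose=0)
print(np.sum(w_before - first.get_weights()[0]))  # expected to be 0.0

My actual code is the following: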
from keras.layers import Dense
from keras.datasets import mnist
from keras.models import Model
from keras.layers import Input
import keras
import numpy as np
# Define shape of the network, layers, some hyperparameters and training data
sizes = [784, 400, 200, 100, 50]
up_layers = []
down_layers = []
for i in range(1, len(sizes)):
    layer = Dense(units=sizes[i], activation='sigmoid', input_dim=sizes[i-1])
    up_layers.append(layer)
for i in range(len(sizes)-2, -1, -1):
    layer = Dense(units=sizes[i], activation='sigmoid', input_dim=sizes[i+1])
    down_layers.append(layer)
batch_size = 128
num_classes = 10
epochs = 1
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
x_train = x_train.reshape([x_train.shape[0], 28*28])
x_test = x_test.reshape([x_test.shape[0], 28*28])
y_train = x_train
y_test = x_test
optimizer = keras.optimizers.Adam(lr=0.01, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0, amsgrad=False)
output = input = Input(shape=(sizes[0],))
# define the first layer and prevent gradients from propagating back into it
output_to_train = up_layers[0](output)
output = keras.layers.Lambda(lambda x: keras.backend.stop_gradient(x))(output_to_train)
for i in range(1, len(up_layers)):
    output = up_layers[i](output)
for i in range(0, len(down_layers)-1):
    output = down_layers[i](output)
crs = [output]
losses = [keras.losses.mean_squared_error]
network = Model([input], crs)
network.compile(loss=losses, optimizer=optimizer)
# Should contain the output of the first layer for the training/test data, but dynamic;
# here it is computed once up front via a helper model, so it is static
first_layer_model = Model([input], [output_to_train])
training_data = [first_layer_model.predict(y_train)]
test_data = [first_layer_model.predict(y_test)]
# Save the weights of first layer before training (those should remain unchanged)
weights_before = up_layers[0].get_weights()[0]
network.fit(x_train, training_data, batch_size=batch_size, epochs=epochs, verbose=1, validation_data=(x_test, test_data))
# Compare weights to weights after training (the sum of the difference has to be zero)
weights_after = up_layers[0].get_weights()[0]
print('\nSum of difference:', np.sum(weights_before-weights_after))
In the code shown I simply run the training data through the first layer once and use that result as the target I want. This is not dynamic; it is computed a single time before training. Since I want the first layer to be trainable by a different method than the rest of the model shown here, this is not an accurate representation of what I am after.
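To make clearer what I mean by "dynamic": conceptually I would like the target to be the first layer's current output, recomputed from its current weights at every step. A rough sketch of that idea (assuming Keras's add_loss can be combined with stop_gradient like this; untested, and reusing the tensors defined in the code above) would be:

# Compare the network output to the first layer's current output inside the graph,
# so the target is recomputed from the current weights of up_layers[0] every batch
dynamic_target = keras.backend.stop_gradient(output_to_train)  # still no gradient into the first layer
network = Model([input], [output])
network.add_loss(keras.backend.mean(keras.backend.square(output - dynamic_target)))
network.compile(optimizer=optimizer)  # loss is supplied entirely by add_loss
network.fit(x_train, batch_size=batch_size, epochs=epochs, verbose=1, validation_data=(x_test, None))

Whether that is the right way to express it is exactly what I am unsure about.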