How can I prevent a simple Keras autoencoder from over-compressing the data?

Asked: 2017-11-15 00:29:47

Tags: machine-learning tensorflow neural-network keras autoencoder

I am trying to use Keras with a TensorFlow backend to build a simple autoencoder as a multidimensional scaling technique, projecting multidimensional data down to 2 dimensions for plotting. Quite often when I run it (I am not sure how to set a random seed, by the way), one of the dimensions collapses and I end up with a 1-dimensional embedding (the plots should help explain). Why does this happen? How can I make sure the autoencoder keeps and uses both dimensions? I realize this is the simplest and most basic form of autoencoder, but I would like to build on it to make better autoencoders for this task.

    from sklearn.datasets import load_iris
    from sklearn import model_selection
    import tensorflow as tf
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt

    # Load data
    X = load_iris().data
    # One-hot encode the targets (.values replaces the removed .as_matrix())
    Y = pd.get_dummies(load_iris().target).values
    X_tr, X_te, Y_tr, Y_te = model_selection.train_test_split(X, Y, test_size=0.3, stratify=Y.argmax(axis=1))

    dims = X_tr.shape[1]
    n_classes = Y_tr.shape[1]

    # Autoencoder
    encoding_dim = 2
    # this is our input placeholder
    input_data = tf.keras.Input(shape=(4,))
    # "encoded" is the encoded representation of the input
    encoded = tf.keras.layers.Dense(encoding_dim, activation='relu')(input_data)
    # "decoded" is the lossy reconstruction of the input
    decoded = tf.keras.layers.Dense(4, activation='sigmoid')(encoded)

    # this model maps an input to its reconstruction
    autoencoder = tf.keras.models.Model(input_data, decoded)
    # this model maps an input to its encoded representation
    encoder = tf.keras.models.Model(input_data, encoded)

    autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
    network_training = autoencoder.fit(X_tr, X_tr, epochs=100, batch_size=5, shuffle=True, verbose=False, validation_data=(X_te, X_te))

    # Plot data
    embeddings = encoder.predict(X_te)
    plt.scatter(embeddings[:,0], embeddings[:,1], c=Y_te.argmax(axis=1), edgecolor="black", linewidth=1)

$$A_{i,j} = \begin{cases} 1, & \text{if } i \neq j \\ n, & \text{if } i = j \end{cases}$$

Running the algorithm once:

[Image: scatter plot of the embeddings]

Running the algorithm again:

[Image: scatter plot of the embeddings]
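One possible cause, offered as an assumption rather than a confirmed diagnosis of the runs above: the bottleneck layer uses `relu`, and every iris feature is positive. If a bottleneck unit happens to start with mostly negative weights, its pre-activation is negative for every sample, so it outputs 0 everywhere, receives no gradient, and cannot recover. The embedding then lies along a single axis. The sketch below illustrates this in plain Python with hand-picked, hypothetical weights (they are not values learned by the model in the question) and three rows of the iris data.

```python
# Illustrative only: the weights below are hypothetical, chosen by hand to
# show the effect. They are not taken from the trained model.

def relu(z):
    return max(0.0, z)

def identity(z):
    return z

def dense(x, weights, bias, activation):
    """Apply one dense layer: activation(x . w + b) for each unit."""
    return [
        activation(sum(xi * wi for xi, wi in zip(x, w_unit)) + b)
        for w_unit, b in zip(weights, bias)
    ]

# Three rows of the iris data set, one per species.
samples = [
    [5.1, 3.5, 1.4, 0.2],
    [7.0, 3.2, 4.7, 1.4],
    [6.3, 3.3, 6.0, 2.5],
]

# Hypothetical 4 -> 2 bottleneck: unit 0 has mixed weights,
# unit 1 has only negative weights.
weights = [
    [0.3, -0.2, 0.5, 0.1],
    [-0.4, -0.1, -0.3, -0.2],
]
bias = [0.0, 0.0]

relu_codes = [dense(x, weights, bias, relu) for x in samples]
linear_codes = [dense(x, weights, bias, identity) for x in samples]

for r, l in zip(relu_codes, linear_codes):
    print("relu:", r, "linear:", l)
```

With `relu`, the second coordinate is 0 for every sample, while the linear version keeps three distinct values there. If this is what is happening in the question's model, a linear bottleneck (`activation=None` in `Dense`) might be worth trying. That is a suggestion to test, not a verified fix.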

0 Answers:

No answers