Question

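I am using Python, Tensorflow and Keras to run an autoencoder on 450x450 RGB front-facing watch images (e.g. watch_1). My goal is to use the encoded representations of these images produced by the autoencoder and compare them to find the most similar watches among them. For now I am using 1500 RGB images, because I do not have a GPU yet, only a PC with 26 GB of RAM.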

My source code is the following:

from keras.layers import Input, Dense
from keras.models import Model
import cv2
import numpy as np
from sklearn import preprocessing
from glob import glob
import sys

data = []
number = 1500
i = 0
for filename in glob('Watches/*.jpg'):
    img = cv2.imread(filename)
    height, width, channels = img.shape

    # Transpose images to one line
    if height == 450 and width == 450:
        img = np.concatenate(img, axis=0)
        img = np.concatenate(img, axis=0)
        data.append(img)
    else:
        print('These are not the correct dimensions')

    i = i + 1
    if i > number:
        break

# Normalise data
data = np.array(data)
Norm = preprocessing.Normalizer()
Norm.fit(data)
data = Norm.transform(data)

# Size of our encoded representations
encoding_dim = 250

# Input placeholder
input_img = Input(shape=(width * height * channels,))
# Encoded representation of the input
encoded = Dense(encoding_dim, activation='relu')(input_img)
# Lossy reconstruction of the input
decoded = Dense(width * height * channels, activation='sigmoid')(encoded)

# Autoencoder model in all
autoencoder = Model(input_img, decoded)

# Compile the model
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy', metrics=['accuracy'])
print(autoencoder.summary())

# Train the model
length = len(data)
data_train = data[:int(0.7*length)]
data_test = data[(int(0.7*length) + 1):]

autoencoder.fit(data_train, data_train, epochs=10, batch_size=50, shuffle=True, validation_data=(data_test, data_test))

In brief, I get the following results:

Epoch 1/10
loss: 0.6883 - acc: 0.0015 - val_loss: 0.6883 - val_acc: 0.0015

Epoch 2/10
loss: 0.6883 - acc: 0.0018 - val_loss: 0.6883 - val_acc: 0.0018

# I omit the other epochs for the sake of brevity  

Epoch 10/10
loss: 0.6883 - acc: 0.0027 - val_loss: 0.6883 - val_acc: 0.0024

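The accuracy is very low.

Is this because I am using a relatively small number of images, or because there is a problem with my source code?

If the problem is the number of images, how many images would I need to reach an accuracy > 80%?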

Answer 1

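After reading the blog post you mentioned in your comment, I want to elaborate on my answer. Your implementation is actually correct, but you do not want to evaluate the autoencoder on its own.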

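An autoencoder is a dimensionality reduction process, so whatever output it generates is always lossy. You can evaluate how well the autoencoder works by adding it as a layer to a neural network that actually performs classification. The lossy representation then becomes the "input" of that subsequent network. In the subsequent network, you want to use a softmax activation as the last layer. You can then evaluate the accuracy of that network.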
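Because the reconstruction is always lossy, one direct way to see how lossy it is would be a per-image reconstruction error, since the `accuracy` metric is designed for class labels rather than continuous pixel values. The following is only a minimal NumPy sketch: `reconstruction_mse`, `x` and `x_hat` are hypothetical names, and the random arrays stand in for flattened images and the autoencoder's outputs.

```python
import numpy as np

def reconstruction_mse(x, x_hat):
    """Mean squared error per image between inputs and reconstructions."""
    x = np.asarray(x, dtype=np.float64)
    x_hat = np.asarray(x_hat, dtype=np.float64)
    # Average the squared pixel differences over each flattened image (one row each).
    return np.mean((x - x_hat) ** 2, axis=1)

# Stand-in data (assumption): 4 "images" of 12 values each, scaled to [0, 1].
rng = np.random.default_rng(0)
x = rng.random((4, 12))
# Stand-in reconstructions: the inputs plus a little noise.
x_hat = np.clip(x + rng.normal(0.0, 0.05, x.shape), 0.0, 1.0)

errors = reconstruction_mse(x, x_hat)
print(errors)  # one error value per image; lower means a closer reconstruction
```

With the real model, `x_hat` would presumably come from something like `autoencoder.predict(data_test)`.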

Think of the autoencoder as a dimensionality-reducing preprocessing step, similar to principal component analysis.

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(autoencoder.layers[1])  # the trained encoder layer (layers[0] is the input layer)
model.add(Dense(10, activation='softmax'))  # assumes 10 watch classes
model.compile(optimizer='adadelta', loss='categorical_crossentropy', metrics=['accuracy'])
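If no class labels are available and the goal is, as in the question, to find the most similar watches, the encoded representations can also be compared directly. The sketch below is one possible approach and not part of the original answer: `most_similar` is a hypothetical helper, cosine similarity is an assumed choice of measure, and the small `codes` array stands in for the encoder output (which might be obtained with something like `Model(input_img, encoded).predict(data)`).

```python
import numpy as np

def most_similar(codes, query_index, top_k=3):
    """Return the indices of the top_k rows of `codes` closest to the
    query row, ranked by cosine similarity (the query itself is excluded)."""
    codes = np.asarray(codes, dtype=np.float64)
    # Scale every code to unit length so a dot product equals cosine similarity.
    norms = np.linalg.norm(codes, axis=1, keepdims=True)
    unit = codes / np.maximum(norms, 1e-12)
    sims = unit @ unit[query_index]
    sims[query_index] = -np.inf  # never return the query as its own neighbour
    return np.argsort(-sims)[:top_k]

# Stand-in encoded representations (assumption): 4 images, 3-dimensional codes.
codes = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 0.0, 1.0],
])

nearest = most_similar(codes, query_index=0, top_k=2)
print(nearest)  # indices of the two codes most similar to image 0
```

Cosine similarity ignores the overall magnitude of a code and compares only its direction; Euclidean distance would be an equally reasonable alternative to try.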

Autoencoder: accuracy and number of images
