I'm having trouble training an autoencoding CNN. My goal is to cluster document images (receipts, letters, etc.) in an unsupervised way (by the way, do you know of algorithms other than autoencoders for this?).
So I tried building an autoencoder, but I keep getting strange decoded output and I can't figure out what the problem is. I started with a very simple model without much compression:
Layer (type) Output Shape Param #
=================================================================
conv2d_62 (Conv2D) (None, 100, 76, 16) 448
_________________________________________________________________
activation_62 (Activation) (None, 100, 76, 16) 0
_________________________________________________________________
conv2d_63 (Conv2D) (None, 50, 38, 32) 4640
_________________________________________________________________
activation_63 (Activation) (None, 50, 38, 32) 0
_________________________________________________________________
conv2d_64 (Conv2D) (None, 50, 38, 32) 9248
_________________________________________________________________
activation_64 (Activation) (None, 50, 38, 32) 0
_________________________________________________________________
up_sampling2d_26 (UpSampling (None, 100, 76, 32) 0
_________________________________________________________________
conv2d_65 (Conv2D) (None, 100, 76, 16) 4624
_________________________________________________________________
activation_65 (Activation) (None, 100, 76, 16) 0
_________________________________________________________________
up_sampling2d_27 (UpSampling (None, 200, 152, 16) 0
_________________________________________________________________
conv2d_66 (Conv2D) (None, 200, 152, 3) 435
_________________________________________________________________
activation_66 (Activation) (None, 200, 152, 3) 0
=================================================================
Total params: 19,395
Trainable params: 19,395
Non-trainable params: 0
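The parameter counts in this summary can be sanity-checked by hand: a Conv2D layer with a k×k kernel, c_in input channels, and c_out filters has k·k·c_in·c_out weights plus c_out biases. A small stdlib-only check against the numbers above:

```python
def conv2d_params(kernel, c_in, c_out):
    """Parameters of a Conv2D layer: k_h*k_w*c_in weights per filter, plus one bias per filter."""
    k_h, k_w = kernel
    return k_h * k_w * c_in * c_out + c_out

# Channel progression from the summary: 3 -> 16 -> 32 -> 32 -> 16 -> 3
layers = [(3, 16), (16, 32), (32, 32), (32, 16), (16, 3)]
counts = [conv2d_params((3, 3), c_in, c_out) for c_in, c_out in layers]
print(counts)       # [448, 4640, 9248, 4624, 435]
print(sum(counts))  # 19395, matching "Total params: 19,395"
```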
I trained on a small number of inputs (~200) so that training is fast and debugging is quicker.
The model seems to converge after 20 epochs with a batch size of 32:
Epoch 1/20
4/4 [==============================] - 5s 1s/step - loss: 0.4359
Epoch 2/20
4/4 [==============================] - 5s 1s/step - loss: 0.4290
Epoch 3/20
4/4 [==============================] - 4s 904ms/step - loss: 0.4192
Epoch 4/20
4/4 [==============================] - 5s 1s/step - loss: 0.4045
Epoch 5/20
4/4 [==============================] - 3s 783ms/step - loss: 0.3886
Epoch 6/20
4/4 [==============================] - 3s 797ms/step - loss: 0.3706
Epoch 7/20
4/4 [==============================] - 5s 1s/step - loss: 0.3393
Epoch 8/20
4/4 [==============================] - 3s 777ms/step - loss: 0.3165
Epoch 9/20
4/4 [==============================] - 3s 850ms/step - loss: 0.2786
Epoch 10/20
4/4 [==============================] - 3s 780ms/step - loss: 0.2436
Epoch 11/20
4/4 [==============================] - 3s 817ms/step - loss: 0.2036
Epoch 12/20
4/4 [==============================] - 3s 771ms/step - loss: 0.1745
Epoch 13/20
4/4 [==============================] - 5s 1s/step - loss: 0.1347
Epoch 14/20
4/4 [==============================] - 3s 820ms/step - loss: 0.1150
Epoch 15/20
4/4 [==============================] - 5s 1s/step - loss: 0.1017
Epoch 16/20
4/4 [==============================] - 3s 792ms/step - loss: 0.0886
Epoch 17/20
4/4 [==============================] - 3s 789ms/step - loss: 0.0868
Epoch 18/20
4/4 [==============================] - 3s 842ms/step - loss: 0.0844
Epoch 19/20
4/4 [==============================] - 3s 762ms/step - loss: 0.0797
Epoch 20/20
4/4 [==============================] - 3s 779ms/step - loss: 0.0768
But the output images look like this:
[Image: output of the autoencoder (example)]
For the loss I used mean absolute error with the SGD optimizer (other algorithms did not converge as well).
I tried increasing the number of epochs, but the loss stagnates around 0.07 and does not go down.
What am I doing wrong? Any ideas for improvement? Thanks in advance.
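To put that plateau in perspective (a back-of-the-envelope check, assuming pixels rescaled to [0, 1] as in the generator below): an MAE of ~0.077 means each output pixel is off by roughly 20 of 255 intensity levels on average, which is far from a sharp reconstruction:

```python
# MAE is averaged over all pixels and channels; with inputs rescaled by 1/255,
# multiplying back by 255 converts the loss into 8-bit intensity levels.
final_mae = 0.0768              # loss after epoch 20 (from the log above)
error_levels = final_mae * 255  # average per-pixel error in intensity levels
print(round(error_levels, 1))   # 19.6 -> each pixel is off by ~20/255 on average
```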
EDIT: here is the code
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, UpSampling2D, Activation
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Dimensions and counts inferred from the model summary and text above
image_dims = (200, 152)         # generator target_size (height, width)
image_rgb_dims = (200, 152, 3)  # model input shape
batch_size = 32
n_images = 200                  # ~200 training images

# Data generator with light augmentation; images rescaled to [0, 1]
datagen = ImageDataGenerator(rescale=1./255, zca_whitening=False,
                             rotation_range=0.2, width_shift_range=0.005,
                             height_shift_range=0.005, zoom_range=0.005)
train_generator = datagen.flow_from_directory('fp_img', class_mode='input',
                                              target_size=image_dims,
                                              batch_size=batch_size, shuffle=True)

# Define the model: two strided convolutions downsample by 4x in total,
# then two UpSampling2D layers bring the resolution back up
model = Sequential()
model.add(Conv2D(16, (3, 3), strides=2, padding='same', input_shape=image_rgb_dims))
model.add(Activation('relu'))
model.add(Conv2D(32, (3, 3), strides=2, padding='same'))
model.add(Activation('relu'))
model.add(Conv2D(32, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(UpSampling2D((2, 2)))
model.add(Conv2D(16, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(UpSampling2D((2, 2)))
model.add(Conv2D(3, (3, 3), padding='same'))
model.add(Activation('sigmoid'))  # output in [0, 1] to match the rescaled inputs
model.summary()

# Compile the model
model.compile(optimizer='adagrad', loss='mean_absolute_error')

# Train the model
model.fit(train_generator,
          steps_per_epoch=n_images // batch_size,
          epochs=20)
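Once an autoencoder like the one above is trained, the usual route to the clustering goal is to run the encoder half on each image, flatten the bottleneck activations into a feature vector, and cluster those vectors, e.g. with k-means. A minimal NumPy-only sketch of the clustering step; the embedding vectors here are synthetic stand-ins for encoder outputs, not real features from this model:

```python
import numpy as np

def kmeans(X, k, iters=50):
    """Plain k-means: assign points to the nearest centroid, then recompute centroids."""
    # Farthest-point initialization: start from X[0], then repeatedly add the
    # point farthest from the current centroids (deterministic, well spread)
    centroids = [X[0]]
    while len(centroids) < k:
        d = np.min([((X - c) ** 2).sum(-1) for c in centroids], axis=0)
        centroids.append(X[int(np.argmax(d))])
    centroids = np.array(centroids)
    for _ in range(iters):
        # Squared distance of every point to every centroid, then hard assignment
        labels = np.argmin(((X[:, None] - centroids[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels

# Stand-in for encoder bottleneck features: two well-separated groups of vectors
rng = np.random.default_rng(1)
embeddings = np.vstack([rng.normal(0, 0.1, (20, 8)),   # e.g. "receipts"
                        rng.normal(3, 0.1, (20, 8))])  # e.g. "letters"
labels = kmeans(embeddings, k=2)
# Each group ends up in a single cluster
print(len(set(labels[:20].tolist())), len(set(labels[20:].tolist())))  # 1 1
```

With a real model, the embeddings would come from a second `Model` sharing the encoder layers, with the bottleneck activations flattened per image.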