我正在使用一组图像训练类似VGG的信号圈(如示例http://keras.io/examples/中所示)。我将图像转换为数组并使用scipy调整大小:
mapper = [] # list of photo ids
data = np.empty((NB_FILES, 3, 100, 100)).astype('float32')
i = 0
for f in onlyfiles[:NB_FILES]:
img = load_img(mypath + f)
a = img_to_array(img)
a_resize = np.empty((3, 100, 100))
a_resize[0,:,:] = sp.misc.imresize(a[0,:,:], (100,100)) / 255.0 # - 0.5
a_resize[1,:,:] = sp.misc.imresize(a[1,:,:], (100,100)) / 255.0 # - 0.5
a_resize[2,:,:] = sp.misc.imresize(a[2,:,:], (100,100)) / 255.0 # - 0.5
photo_id = int(f.split('.')[0])
mapper.append(photo_id)
data[i, :, :, :] = a_resize; i += 1
在最后一个致密层中,我有2个神经元,我用softmax激活。以下是最后一行:
model.add(Dense(2))
model.add(Activation('softmax'))
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)
model.fit(data, target_matrix, batch_size=32, nb_epoch=2, verbose=1, show_accuracy=True, validation_split=0.2)
我无法改善减少损失,每个时代都有与以前相同的损失和精度。损失实际上在第1和第2纪元之间上升:
Train on 1600 samples, validate on 400 samples
Epoch 1/5
1600/1600 [==============================] - 23s - loss: 3.4371 - acc: 0.7744 - val_loss: 3.8280 - val_acc: 0.7625
Epoch 2/5
1600/1600 [==============================] - 23s - loss: 3.4855 - acc: 0.7837 - val_loss: 3.8280 - val_acc: 0.7625
Epoch 3/5
1600/1600 [==============================] - 23s - loss: 3.4855 - acc: 0.7837 - val_loss: 3.8280 - val_acc: 0.7625
Epoch 4/5
1600/1600 [==============================] - 23s - loss: 3.4855 - acc: 0.7837 - val_loss: 3.8280 - val_acc: 0.7625
Epoch 5/5
1600/1600 [==============================] - 23s - loss: 3.4855 - acc: 0.7837 - val_loss: 3.8280 - val_acc: 0.7625
我做错了什么?
答案 0 :(得分:3)
根据我的经验,这通常发生在学习率过高的情况下。 优化将无法找到最小值,只是转过来#34;。
理想的费率取决于您的数据和网络架构。
(作为参考,我现在正在运行一个8层的回旋网,样本大小与你的相似,并且在我将学习率降低到0.001之前,可以观察到相同的收敛性不足<) / p>
答案 1 :(得分:2)
我的建议是降低学习率,尝试数据扩充。
数据扩充代码:
print('Using real-time data augmentation.')
# this will do preprocessing and realtime data augmentation
datagen = ImageDataGenerator(
featurewise_center=False, # set input mean to 0 over the dataset
samplewise_center=False, # set each sample mean to 0
featurewise_std_normalization=False, # divide inputs by std of the dataset
samplewise_std_normalization=False, # divide each input by its std
zca_whitening=True, # apply ZCA whitening
rotation_range=90, # randomly rotate images in the range (degrees, 0 to 180)
width_shift_range=0.1, # randomly shift images horizontally (fraction of total width)
height_shift_range=0.1, # randomly shift images vertically (fraction of total height)
horizontal_flip=True, # randomly flip images
vertical_flip=False) # randomly flip images
# compute quantities required for featurewise normalization
# (std, mean, and principal components if ZCA whitening is applied)
datagen.fit(X_train)
# fit the model on the batches generated by datagen.flow()
model.fit_generator(datagen.flow(X_train, Y_train,
batch_size=batch_size),
samples_per_epoch=X_train.shape[0],
nb_epoch=nb_epoch)