Unable to reproduce results with Keras fit_generator

Time: 2018-08-22 18:18:21

Tags: python tensorflow keras classification vgg-net

I just noticed that I get different results every time I run my Keras model. I have tried essentially everything suggested in this issue on GitHub:

  • Setting the seeds before importing anything else
  • Setting shuffle=False in fit_generator()

Even with both of these in place, I still cannot reproduce the same results.

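I also posted the same question on the issue linked above, but for visibility I decided to post it here as well, hoping someone can help me find out what is going wrong.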

import numpy as np
import tensorflow as tf
import random as rn
import os
os.environ['PYTHONHASHSEED'] = '0'
np.random.seed(42)
rn.seed(12345)
session_conf = tf.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
from keras import backend as K
tf.set_random_seed(1234)
sess = tf.Session(graph=tf.get_default_graph(), config=session_conf)
K.set_session(sess)

from keras.layers import Input, Dropout, Flatten, Conv2D, MaxPooling2D, Dense, Activation, Lambda,GlobalAveragePooling2D
from keras.optimizers import RMSprop , SGD, Adam,Nadam
from keras.callbacks import ModelCheckpoint, Callback, EarlyStopping, History
from keras.preprocessing.image import ImageDataGenerator
from keras.applications import VGG16, VGG19, ResNet50, Xception
from keras.models import Model

batch_size = 32
num_channels = 3
img_size = 512
img_full_size = (img_size, img_size, num_channels)
num_classes = 2
seed = 1 # for image transformations
train_path = 'keras_folders/train/'
validation_path = 'keras_folders/val/'
test_path = 'keras_folders/test/'

train_datagen = ImageDataGenerator(
    rescale=1./255,
    horizontal_flip=True)

validation_datagen = ImageDataGenerator(
    rescale=1./255)

test_datagen = ImageDataGenerator(
    rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    train_path,
    target_size=(img_size, img_size),
    batch_size=batch_size,
    class_mode='categorical', 
    seed=seed)

validation_generator = validation_datagen.flow_from_directory(
    validation_path,
    target_size=(img_size, img_size),
    batch_size=batch_size,
    shuffle=False,
    class_mode='categorical',
    seed=seed)

from collections import Counter
counter = Counter(train_generator.classes)
max_val = float(max(counter.values()))
class_weights = {class_id : max_val/num_images for class_id, num_images in counter.items()}  

conv_base = VGG16(weights='imagenet', include_top=False, input_shape=img_full_size)
conv_base.trainable=True
for layer in conv_base.layers[:4]:
    layer.trainable = False
x = Flatten()(conv_base.output)
x = Dense(256, activation='relu')(x)
x = Dropout(0.218)(x)
predictions = Dense(num_classes, activation='softmax')(x)
model = Model(inputs = conv_base.input , outputs=predictions)

adam = Adam(lr=0.0001)
model.compile(loss='categorical_crossentropy', optimizer=adam, metrics=['accuracy'])

train_samples = train_generator.samples
validation_samples = validation_generator.samples
model.fit_generator(
    train_generator,
    class_weight=class_weights,
    steps_per_epoch= train_samples // batch_size,
    epochs=1,
    validation_data= validation_generator,
    validation_steps= validation_samples // batch_size,
    shuffle=False)

1 Answer:

Answer 0: (score: 0)

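I think you have to do the opposite. The fit function has shuffle turned on by default, whereas fit_generator takes its shuffling from the generator. Your train_generator sets the seed parameter but not the shuffle parameter. Is it possible that your ImageDataGenerator defaults shuffle to False?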
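The interplay between shuffle and seed inside a generator can be sketched in plain Python. This is only a toy stand-in for a Keras directory iterator, not Keras code: batch_index_generator and take are hypothetical names, and the assumption is that the iterator reshuffles its sample indices once per epoch using its own seeded random number generator.

```python
import random


def batch_index_generator(n_samples, batch_size, shuffle=True, seed=None):
    """Yield lists of sample indices forever, one list per batch.

    Hypothetical stand-in for a Keras directory iterator (assumption:
    the order is reshuffled once per epoch with a private, seeded RNG).
    """
    rng = random.Random(seed)  # private RNG, independent of global seeds
    indices = list(range(n_samples))
    while True:
        if shuffle:
            rng.shuffle(indices)  # new sample order at the start of each epoch
        for start in range(0, n_samples, batch_size):
            yield indices[start:start + batch_size]


def take(gen, n):
    """Collect the next n batches from a generator."""
    return [next(gen) for _ in range(n)]


# Two "runs" with shuffling enabled and the same seed: 6 batches = 2 epochs.
run_a = take(batch_index_generator(10, 4, shuffle=True, seed=1), 6)
run_b = take(batch_index_generator(10, 4, shuffle=True, seed=1), 6)

# Shuffling disabled: batches always come out in directory order.
unshuffled = take(batch_index_generator(10, 4, shuffle=False), 3)

print(run_a == run_b)
print(unshuffled)
```

In this sketch the batch order is controlled entirely by the generator's own shuffle and seed arguments, which is why a shuffle flag passed to the training call would not affect it.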

This discussion suggests enabling shuffle in the training iterator: https://github.com/keras-team/keras/issues/2389. I had the same problem, and this solved it.

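Setting the seeds is only necessary if you want to reproduce the results of a given piece of code exactly. I doubt that setting the seeds will produce exactly the same results between fit and fit_generator.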
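One way to read this last point: a seed fixes the sequence of random numbers, not how a particular code path consumes them. The toy sketch below uses two hypothetical functions, path_a and path_b, which are not Keras's fit or fit_generator; the extra draw in path_b is purely an assumption made for illustration.

```python
import random


def path_a(seed):
    """Draw three random numbers straight from a seeded RNG."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(3)]


def path_b(seed):
    """Same seed, but one extra draw happens first (hypothetical,
    e.g. an additional shuffle somewhere in the pipeline)."""
    rng = random.Random(seed)
    rng.random()  # extra draw shifts every later value by one position
    return [rng.random() for _ in range(3)]


same_code_same_seed = path_a(42) == path_a(42)
different_code_same_seed = path_a(42) == path_b(42)

print(same_code_same_seed)
print(different_code_same_seed)
```

Rerunning the same code path with the same seed repeats its numbers, while two different code paths can diverge even when they share a seed.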