To compare a set of different optimizers (SGD, Adam, Adagrad, etc.), I am trying to plot the model's overall accuracy against the learning rate. However, when I plot the variables, the resulting matplotlib figure is empty.
This is Keras running on Google Colab.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPool2D, Dropout, Flatten, Dense
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.callbacks import ReduceLROnPlateau

model = Sequential()
dim = 28       # input image size (28x28, e.g. MNIST)
nclasses = 10  # number of output classes
model.add(Conv2D(filters=32, kernel_size=(5,5), padding='same', activation='relu', input_shape=(dim,dim,1)))
model.add(Conv2D(filters=32, kernel_size=(5,5), padding='same', activation='relu'))
model.add(MaxPool2D(pool_size=(2,2), strides=(2,2)))
model.add(Dropout(0.2))
model.add(Conv2D(filters=64, kernel_size=(5,5), padding='same', activation='relu'))
model.add(Conv2D(filters=64, kernel_size=(5,5), padding='same', activation='relu'))
model.add(MaxPool2D(pool_size=(2,2), strides=(2,2)))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(120, activation='relu'))
model.add(Dense(84, activation='relu'))
model.add(Dense(nclasses, activation='softmax'))
opt = SGD(lr=0.001)
reduce_lr = ReduceLROnPlateau(monitor='val_acc', factor=0.9, patience=25, min_lr=0.000001, verbose=1)
model.compile(optimizer=opt, loss="categorical_crossentropy", metrics=["accuracy"])
history = model.fit(x=x_train,
                    y=y_train,
                    batch_size=10,
                    epochs=1,
                    verbose=1,
                    callbacks=[reduce_lr],
                    validation_data=(x_val,y_val),
                    shuffle=True)
import matplotlib.pyplot as plt

plt.plot(history.history['val_acc'])
plt.plot(history.history['lr'])
plt.title('Plot of overall accuracy to learning rate for SGD optimizer')
plt.ylabel('accuracy')
plt.xlabel('learning rate')
plt.legend(['val_acc', 'lr'], loc='upper right')
plt.show()
Answer 0 (Score: 1)
Your plot is empty because you trained for only one epoch. But that is not the worst of your problems: you are trying to plot the per-epoch learning rate (a constant value) against the per-epoch validation accuracy. If the learning rate stays constant, what do you expect to see?
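To see why a single epoch produces an empty-looking figure, consider what history.history contains after one epoch. This is a minimal sketch with illustrative values, not the actual numbers from the run above:

```python
# After model.fit(..., epochs=1), every metric list in history.history
# holds exactly one entry -- illustrative values shown here:
history_history = {'val_acc': [0.42], 'lr': [0.001]}

# plt.plot() of a one-element sequence draws a single point with no
# line segment, so the axes look empty unless a marker is requested,
# e.g. plt.plot(history_history['val_acc'], marker='o').
for metric, values in history_history.items():
    print(metric, len(values))  # each metric has length 1
```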
What you should do instead is take the learning rate used for each run and plot it against the best accuracy reached during that run. For example, train three times with different learning rates, then plot the maximum accuracy against the learning rate, like this:
import matplotlib.pyplot as plt
%matplotlib inline
# Initial learning rate of each run, paired with its best validation accuracy
lrs = [history1.history['lr'][0],
       history2.history['lr'][0],
       history3.history['lr'][0]]
vals = [max(history1.history['val_acc']),
        max(history2.history['val_acc']),
        max(history3.history['val_acc'])]
# Sort the pairs by learning rate so the line is drawn left to right
lrs, vals = zip(*sorted(zip(lrs, vals)))
lrs, vals = list(lrs), list(vals)
plt.plot(lrs, vals)
plt.title('Plot of overall accuracy to learning rate for SGD optimizer')
plt.ylabel('Max Accuracy')
plt.xlabel('Learning Rate')
plt.show()
This will produce a result like the following:
Here is an example of how those three runs could be defined:
opt1 = SGD(lr=0.001)
opt2 = SGD(lr=0.01)
opt3 = SGD(lr=0.1)
reduce_lr = ReduceLROnPlateau(monitor='val_acc', factor=0.9, patience=25, min_lr=0.000001, verbose=1)
model2 = tf.keras.models.clone_model(model)  # copies the architecture; weights are freshly initialized
model3 = tf.keras.models.clone_model(model)
model.compile(optimizer=opt1, loss="categorical_crossentropy", metrics=["accuracy"])
history1 = model.fit(x=x_train,
                     y=y_train,
                     batch_size=10,
                     epochs=10,
                     verbose=1,
                     callbacks=[reduce_lr],
                     validation_data=(x_val,y_val),
                     shuffle=True)
model2.compile(optimizer=opt2, loss="categorical_crossentropy", metrics=["accuracy"])
history2 = model2.fit(x=x_train,
                      y=y_train,
                      batch_size=10,
                      epochs=10,
                      verbose=1,
                      callbacks=[reduce_lr],
                      validation_data=(x_val,y_val),
                      shuffle=True)
model3.compile(optimizer=opt3, loss="categorical_crossentropy", metrics=["accuracy"])
history3 = model3.fit(x=x_train,
                      y=y_train,
                      batch_size=10,
                      epochs=10,
                      verbose=1,
                      callbacks=[reduce_lr],
                      validation_data=(x_val,y_val),
                      shuffle=True)
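The three copy-pasted compile/fit blocks above can also be written as a loop. A minimal sketch, where train_and_eval is a hypothetical stand-in for cloning the model, compiling it with SGD at the given learning rate, fitting, and returning the best validation accuracy (the returned numbers here are illustrative only):

```python
def train_and_eval(lr):
    # Hypothetical placeholder: the real version would clone_model(model),
    # compile with SGD(lr=lr), call fit(...), and return
    # max(history.history['val_acc']). Illustrative values only:
    return {0.001: 0.97, 0.01: 0.98, 0.1: 0.90}[lr]

learning_rates = [0.001, 0.01, 0.1]
# Collect (learning rate, best accuracy) pairs, sorted by learning rate
results = sorted((lr, train_and_eval(lr)) for lr in learning_rates)
lrs = [lr for lr, _ in results]
vals = [acc for _, acc in results]
print(lrs)   # [0.001, 0.01, 0.1]
print(vals)  # [0.97, 0.98, 0.9]
```

The lrs/vals lists produced this way can be passed straight to the plt.plot call shown earlier.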