I have a question about hyperparameter optimization that I have not been able to find an answer to online. When searching hyperparameters, I approached the problem by fixing the number of epochs, the optimizer, and the learning rate, and searching only over the number of neurons and the batch size, with the following results:
Out of curiosity, I decided to add more parameters to the grid and searched over the following hyperparameters: learning rate, batch size, number of neurons, and number of epochs. I got these results:
As I understand it, the more hyperparameters I give to GridSearchCV, the better my model should perform on the test dataset. I would appreciate any feedback, or links I could read to better understand the problem.
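As an illustration of what that expanded search space could look like (hypothetical values, not my exact grid; it assumes that create_model in the code below is given a learning_rate argument, so that KerasClassifier can forward it, and that 'epochs' is routed to fit):

param_grid_expanded = {
    'neurons_input': [20, 25, 30, 35],
    'neurons_hidden_1': [20, 25, 30, 35],
    'batch_size': [32, 60, 80],
    'epochs': [50, 100, 150],              # routed to KerasClassifier.fit
    'learning_rate': [1e-2, 1e-3, 1e-4]}   # routed to create_model (assumed argument)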
Dataset: a time series with severely imbalanced classes (addressed via class weights). RandomizedSearchCV includes a cross-validation component. Sample code for the model:
import numpy as np
import keras as k   # 'k' is used below for k.optimizers.Adam
from keras.models import Sequential
from keras.layers import Dense
from keras.initializers import RandomNormal
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import RandomizedSearchCV, TimeSeriesSplit
from sklearn.utils.class_weight import compute_class_weight

def create_model(activation_1='relu', activation_2='relu',
                 neurons_input=1, neurons_hidden_1=1,
                 optimizer='adam',
                 input_shape=(X_train.shape[1],)):
    model = Sequential()
    model.add(Dense(neurons_input, activation=activation_1, input_shape=input_shape,
                    kernel_initializer=RandomNormal(mean=0.0, stddev=0.05, seed=42),
                    bias_initializer=RandomNormal(mean=0.0, stddev=0.05, seed=42)))
    model.add(Dense(neurons_hidden_1, activation=activation_2,
                    kernel_initializer=RandomNormal(mean=0.0, stddev=0.05, seed=42),
                    bias_initializer=RandomNormal(mean=0.0, stddev=0.05, seed=42)))
    model.add(Dense(2, activation='softmax'))
    # Note: the 'optimizer' argument is currently unused; Adam with lr=1e-4 is hard-coded.
    model.compile(loss='sparse_categorical_crossentropy', optimizer=k.optimizers.Adam(lr=1e-4))
    return model
clf = KerasClassifier(build_fn=create_model, epochs=100, verbose=0)

param_grid = {
    'neurons_input': [20, 25, 30, 35],
    'neurons_hidden_1': [20, 25, 30, 35],
    'batch_size': [32, 60, 80]}

# Class weights to counter the severe class imbalance
class_weights = compute_class_weight('balanced', np.unique(y_train), y_train)
class_weights = dict(enumerate(class_weights))

# Time-series-aware cross-validation splits
my_cv = TimeSeriesSplit(n_splits=5).split(X_train)

rs_keras = RandomizedSearchCV(clf, param_grid, cv=my_cv, scoring='accuracy', refit='accuracy',
                              verbose=3, n_jobs=1, random_state=42)
rs_keras.fit(X_train, y_train, class_weight=class_weights)
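After fitting, this is roughly how I inspect the search and compare runs on held-out data (a sketch; X_test, y_test and the accuracy_score import are not part of the code above):

from sklearn.metrics import accuracy_score

print(rs_keras.best_params_)            # best hyperparameter combination found by the CV search
print(rs_keras.best_score_)             # its mean cross-validated accuracy
y_pred = rs_keras.predict(X_test)       # uses the best estimator refit on all of X_train
print(accuracy_score(y_test, y_pred))   # test-set accuracy, used to compare the two runs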