What is the difference or relationship between a neural network (NN) epoch and the max_iter parameter in scikit-learn?
For example, in the code below an NN model is evaluated for max_iter from 1 to 10000, and the mean absolute error is computed for each value — can each of these iterations be seen as an epoch? Please see the picture/link below!
Thank you very much!
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import cross_val_score

for i in range(1, 10000, 10):
    clf = MLPRegressor(max_iter=i, solver='lbfgs', alpha=1e-6, activation='relu',  # relu improved training a lot
                       hidden_layer_sizes=hidden_layer_sizes, random_state=1)
    clf.fit(X_train_scaled, y_train)
    mae_B = cross_val_score(clf, X_train_scaled, y_train, scoring="neg_mean_absolute_error", cv=10)
    print(i, float(-mae_B.mean()), clf.score(X_train_scaled, y_train), clf.score(X_test_scaled, y_test))
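A cheaper way to see the per-epoch behaviour this loop approximates is to fit once with an iterative solver and read the training history scikit-learn already records. A minimal sketch, assuming solver='adam' is swapped in here because 'lbfgs' does not expose loss_curve_:

from sklearn.neural_network import MLPRegressor

clf = MLPRegressor(max_iter=10000, solver='adam', alpha=1e-6, activation='relu',
                   hidden_layer_sizes=hidden_layer_sizes, random_state=1)
clf.fit(X_train_scaled, y_train)

print(clf.n_iter_)          # epochs actually run before a stopping criterion was met
print(clf.loss_curve_[:5])  # training loss per epoch ('sgd'/'adam' solvers only)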
Answer 0 (score: 1)
max_iter is the maximum number of epochs you want the model to be trained for. It is called a maximum because learning can also stop before that limit is reached, based on other termination criteria such as n_iter_no_change. So do not loop over different max_iter values; if you want to avoid overfitting, tune tol and n_iter_no_change instead.
Try the following: set enough epochs in max_iter, then play with n_iter_no_change and tol. Reference: Doc
clf = MLPRegressor(max_iter=50, solver='lbfgs', alpha=1e-6, activation='relu',
                   hidden_layer_sizes=hidden_layer_sizes, random_state=1,
                   tol=1e-3, n_iter_no_change=5)  # note: n_iter_no_change only takes effect with the 'sgd'/'adam' solvers
clf.fit(X_train_scaled, y_train)
mae_B = cross_val_score(clf, X_train_scaled, y_train, scoring="neg_mean_absolute_error", cv=10)
print(float(-mae_B.mean()), clf.score(X_train_scaled, y_train), clf.score(X_test_scaled, y_test))
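As a speculative variant (not part of the original answer), early_stopping can hand the stopping decision to the network itself: it holds out a validation split and stops once the validation score has not improved for n_iter_no_change epochs. Since early_stopping and n_iter_no_change only take effect with the iterative solvers, 'adam' is assumed here instead of 'lbfgs':

clf = MLPRegressor(max_iter=5000, solver='adam', alpha=1e-6, activation='relu',
                   hidden_layer_sizes=hidden_layer_sizes, random_state=1,
                   tol=1e-3, n_iter_no_change=5,
                   early_stopping=True, validation_fraction=0.1)
clf.fit(X_train_scaled, y_train)
print(clf.n_iter_, clf.best_validation_score_)  # epochs run, and the validation R^2 that triggered the stop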