Why does my machine learning model's training error stay at zero?

Time: 2020-05-05 13:39:08

Tags: machine-learning scikit-learn linear-regression

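I am trying to plot learning curves for two machine learning models. One of the models gives a good MSE value. However, when I plot its learning curve, the training error is always zero. I don't know whether the mistake is in my code or in my data.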

My code is adapted from the following:

prior answer

My code is below:

import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import learning_curve

# df2 is assumed to be a pandas DataFrame that was loaded earlier.

# 80:20 ratio: with cv = 5, each split uses
# 80% of the rows for training and
# 20% for validation.

train_sizes = [1, 50, 100, 150, 204]

features = ['Sex', 'Age', 'Education', 'Ideology', 'Likeability_pre-debate_AC', 'Proximity_PS', 'Proximity_Other', 'Debate Performance_AC', 'Int_Likeability_pre-debate_debate_performance_AC', 'Int_Likeability_pre-debate_Proximity_PS', 'Int_Likeability_pre-debate_Proximity_other', 'Likeability_post_debate_AC']

target = 'Likeability_post_debate_AC'

train_sizes, train_scores, validation_scores = learning_curve(
    estimator = LinearRegression(),
    X = df2[features],
    y = df2[target],
    train_sizes = train_sizes,
    cv = 5,
    scoring = 'neg_mean_squared_error')

train_scores_mean = -train_scores.mean(axis = 1)
validation_scores_mean = -validation_scores.mean(axis = 1)

# On Matplotlib 3.6 and later this style is named 'seaborn-v0_8'.
plt.style.use('seaborn')
plt.plot(train_sizes, train_scores_mean, label = 'Training error')
plt.plot(train_sizes, validation_scores_mean, label = 'Validation error')
plt.ylabel('MSE', fontsize = 14)
plt.xlabel('Training set size', fontsize = 14)
plt.title('Learning curves for a linear regression model', fontsize = 18, y = 1.03)
plt.legend()
plt.ylim(-0.5,0.5)

Why does my training error stay at zero from the very start?
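One thing worth checking: the `features` list above contains `'Likeability_post_debate_AC'`, which is also the `target`. If the target column is among the inputs, a linear model can reproduce it exactly, and the training MSE would be expected to sit at zero for every training size. Separately, a training size of 1 is fit perfectly by any linear regression, so the first point of the curve would be zero either way. The sketch below illustrates the first effect on synthetic data; the array names, the number of columns, and the noise level are assumptions made for illustration and are not taken from `df2`.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import learning_curve

# Synthetic stand-in for the real data (hypothetical, not df2).
rng = np.random.default_rng(0)
n_rows = 255
X = rng.normal(size=(n_rows, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=1.0, size=n_rows)

# Same inputs with the target appended as an extra feature column.
X_leaky = np.column_stack([X, y])


def mean_train_mse(inputs, target):
    """Return the mean training MSE for each training-set size."""
    _, train_scores, _ = learning_curve(
        estimator=LinearRegression(),
        X=inputs,
        y=target,
        train_sizes=[50, 100, 150, 204],
        cv=5,
        scoring='neg_mean_squared_error',
    )
    # Scores are negated MSE values, so flip the sign back.
    return -train_scores.mean(axis=1)


mse_leaky = mean_train_mse(X_leaky, y)  # target included among the features
mse_clean = mean_train_mse(X, y)        # target excluded from the features

print('Training MSE with target in features:', mse_leaky)
print('Training MSE without target in features:', mse_clean)
```

If this is what is happening in the question's code, removing `target` from `features` before calling `learning_curve` should give a non-zero training curve.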

0 answers:

No answers