Converting cross_validation code to model_selection

Date: 2018-08-02 12:57:38

Tags: python scikit-learn cross-validation sklearn-pandas

In 2016, I ran a lasso regression model with the following code:

# Import required packages
import pandas as pd
import numpy as np
import matplotlib as mpl
import matplotlib.pylab as plt
import matplotlib.pyplot as plp
import seaborn as sns
import statsmodels.formula.api as smf
from scipy import stats
from sklearn.cross_validation import train_test_split
from sklearn.linear_model import LassoLarsCV

# split data into train and test sets
pred_train, pred_test, tar_train, tar_test = train_test_split(predictors, target, test_size=.4, random_state=123)

#%%
# specify the lasso regression model
model = LassoLarsCV(cv=10, precompute=False).fit(pred_train, tar_train)

#%%
# print variable names and regression coefficients
dict(zip(predictors.columns, model.coef_))
#regcoef.to_csv('variable+regresscoef.csv')

#%%
# plot coefficient progression
m_log_alphas = -np.log10(model.alphas_)
ax = plt.gca()
plt.plot(m_log_alphas, model.coef_path_.T)
plt.axvline(-np.log10(model.alpha_), linestyle='--', color='k', label='alpha CV')
plt.ylabel('Regression Coefficients')
plt.xlabel('-log(alpha)')
plt.title('Regression Coefficients Progression for Lasso Paths')

#%%
# plot mean square error for each fold
m_log_alphascv = -np.log10(model.cv_alphas_)
plt.figure()
plt.plot(m_log_alphascv, model.cv_mse_path_, ':')
plt.plot(m_log_alphascv, model.cv_mse_path_.mean(axis=-1), 'k', label='Average across the folds', linewidth=2)
plt.axvline(-np.log10(model.alpha_), linestyle='--', color='k', label='alpha CV')
plt.legend()
plt.xlabel('-log(alpha)')
plt.ylabel('Mean squared error')
plt.title('Mean squared error on each fold')

#%%
# MSE from training and test data
from sklearn.metrics import mean_squared_error
train_error = mean_squared_error(tar_train, model.predict(pred_train))
test_error = mean_squared_error(tar_test, model.predict(pred_test))
print('training data MSE')
print(train_error)
print('test data MSE')
print(test_error)

#%%
# R-square from training and test data
rsquared_train = model.score(pred_train, tar_train)
rsquared_test = model.score(pred_test, tar_test)
print('training data R-square')
print(rsquared_train)
print('test data R-square')
print(rsquared_test)

Now I want to run it again, and I get the following warning:

DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved.

How can I rewrite this code to use model_selection?

1 Answer:

Answer 0 (score: 2)

The only thing I can see here that uses the old cross_validation module is train_test_split.

So just change your import from:

from sklearn.cross_validation import train_test_split

to:

from sklearn.model_selection import train_test_split

and you should be good.
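To see the changed import working end to end, here is a minimal sketch of the modelling part of the question's script with the plotting left out. The data is synthetic and purely an assumption, standing in for the question's `predictors` and `target`, which the post does not show, and the column names are made up. It also reads `mse_path_` instead of `cv_mse_path_`; as far as I recall the old attribute name was deprecated around the same release and removed later, so treat that as something to check against your installed version.

```python
import numpy as np
from sklearn.linear_model import LassoLarsCV
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split  # was: sklearn.cross_validation

# Synthetic stand-in for the question's data (assumption, not the real dataset).
rng = np.random.RandomState(123)
columns = ["x0", "x1", "x2", "x3", "x4"]  # hypothetical column names
predictors = rng.normal(size=(200, len(columns)))
target = 3.0 * predictors[:, 0] - 2.0 * predictors[:, 1] + rng.normal(scale=0.1, size=200)

# Same call and arguments as in the question; only the import location changed.
pred_train, pred_test, tar_train, tar_test = train_test_split(
    predictors, target, test_size=0.4, random_state=123)

# Fit the lasso model with 10-fold cross-validation, as in the question.
model = LassoLarsCV(cv=10, precompute=False).fit(pred_train, tar_train)

# Variable names and regression coefficients.
coefficients = dict(zip(columns, model.coef_))
print(coefficients)

# Per-fold MSE along the alpha path (newer name for cv_mse_path_).
mean_cv_mse = model.mse_path_.mean(axis=-1)

# MSE and R-square on training and test data.
train_error = mean_squared_error(tar_train, model.predict(pred_train))
test_error = mean_squared_error(tar_test, model.predict(pred_test))
rsquared_train = model.score(pred_train, tar_train)
rsquared_test = model.score(pred_test, tar_test)
print("training data MSE", train_error)
print("test data MSE", test_error)
print("training data R-square", rsquared_train)
print("test data R-square", rsquared_test)
```

The call to train_test_split itself is unchanged, so the rest of the original script should not need edits for this particular warning.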