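I am doing decision tree regression in Python. However, the predicted target value for a test sample is simply the mean of the target variable in the leaf it lands in. Is there a way to go beyond the mean and run a multiple regression within that bucket to estimate the test sample's target variable?
P.S.: I would like to know about functionality in Python along the lines of: https://www.researchgate.net/publication/2640479_Employing_Linear_Regression_in_Regression_Tree_Leaves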
Answer 0 (score: 0)
Is there a way to go beyond the mean and run a multiple regression within that bucket to estimate the test sample's target variable?
Perhaps you could use the sklearn.model_selection.cross_validate function to run cross-validation, which can return several scores at once:
>>> from sklearn import datasets, linear_model
>>> from sklearn.model_selection import cross_validate
>>> from sklearn.metrics import make_scorer
>>> from sklearn.metrics import confusion_matrix
>>> from sklearn.svm import LinearSVC
>>> diabetes = datasets.load_diabetes()
>>> X = diabetes.data[:150]
>>> y = diabetes.target[:150]
>>> lasso = linear_model.Lasso()
>>> scores = cross_validate(lasso, X, y, cv=3,
...                         scoring=('r2', 'neg_mean_squared_error'),
...                         return_train_score=True)
>>> print(scores['test_neg_mean_squared_error'])
[-3635.5... -3573.3... -6114.7...]
>>> print(scores['train_r2'])
[0.28010158 0.39088426 0.22784852]
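
As for fitting a regression inside each leaf, one possible approach is sketched below. It is only a minimal sketch, not a built-in scikit-learn feature: the class name LeafLinearTree and the settings max_depth=2 and min_samples_leaf=30 are hypothetical choices made for illustration. It fits a DecisionTreeRegressor, uses its apply() method to find which leaf each sample lands in, and then fits a separate LinearRegression on the training samples of each leaf. The splits are still chosen by the tree's usual criterion, so this is not necessarily the procedure described in the linked paper.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor


class LeafLinearTree:
    """Hypothetical helper: regression tree with a linear model per leaf."""

    def __init__(self, max_depth=2, min_samples_leaf=30):
        # min_samples_leaf keeps enough samples in each leaf to fit a regression
        self.tree = DecisionTreeRegressor(
            max_depth=max_depth,
            min_samples_leaf=min_samples_leaf,
            random_state=0,
        )
        self.leaf_models = {}

    def fit(self, X, y):
        self.tree.fit(X, y)
        # apply() returns the index of the leaf each sample ends up in
        leaf_ids = self.tree.apply(X)
        for leaf in np.unique(leaf_ids):
            mask = leaf_ids == leaf
            self.leaf_models[leaf] = LinearRegression().fit(X[mask], y[mask])
        return self

    def predict(self, X):
        leaf_ids = self.tree.apply(X)
        y_pred = np.empty(X.shape[0])
        for leaf in np.unique(leaf_ids):
            mask = leaf_ids == leaf
            # use the leaf's linear model instead of the leaf mean
            y_pred[mask] = self.leaf_models[leaf].predict(X[mask])
        return y_pred


X, y = load_diabetes(return_X_y=True)
X_train, y_train = X[:300], y[:300]
X_test, y_test = X[300:], y[300:]

model = LeafLinearTree().fit(X_train, y_train)
y_pred = model.predict(X_test)
print(y_pred[:5])
```

Whether this beats the plain tree depends on the data, so it is worth comparing both with cross-validation before relying on it.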