What should I use instead of Bootstrap?

Time: 2015-01-19 17:25:07

Tags: python scikit-learn

When I run this code:

from sklearn import cross_validation
bs = cross_validation.Bootstrap(9, random_state=0)

I get this deprecation warning:

C:\Anaconda\envs\p33\lib\site-packages\sklearn\cross_validation.py:684: DeprecationWarning: Bootstrap will no longer be supported as a cross-validation method as of version 0.15 and will be removed in 0.17
  "will be removed in 0.17", DeprecationWarning)

What should I use instead of Bootstrap?

3 answers:

Answer 0 (score: 5)

From the scikit-learn 0.15 release notes, under "API changes summary":

  

From the source code itself:

# See, e.g., http://youtu.be/BzHz0J9a6k0?t=9m38s for a motivation
# behind this deprecation
warnings.warn("Bootstrap will no longer be supported as a " +
              "cross-validation method as of version 0.15 and " +
              "will be removed in 0.17", DeprecationWarning)

Answer 1 (score: 1)

You can use BaggingClassifier:

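import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import recall_score

# your_estimator, X and y are placeholders for your own model and NumPy arrays.
# The estimator is passed positionally because its keyword was renamed from
# base_estimator to estimator in scikit-learn 1.2.
bag = BaggingClassifier(your_estimator,
                        n_estimators=100,
                        max_samples=1.0,
                        bootstrap=True,
                        n_jobs=-1)
bag.fit(X, y)
recalls = []
for estimator, samples in zip(bag.estimators_, bag.estimators_samples_):
    # compute predictions on out-of-bag samples; this works whether
    # samples holds row indices (recent releases) or a boolean mask
    mask = np.ones(len(X), dtype=bool)
    mask[samples] = False
    # sub-estimators predict encoded class indices, so map back to labels
    y_pred = bag.classes_[estimator.predict(X[mask])]
    # compute some statistic (binary recall here)
    recalls.append(recall_score(y[mask], y_pred))
# Do something with stats, e.g. find confidence interval
print(np.percentile(recalls, [2.5, 97.5]))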

Answer 2 (score: 0)

I just ran into this problem, and the solution I found (as of scikit-learn 0.24) is to use the resample utility.

from sklearn.utils import resample

Each call generates one bootstrap step, using the default arguments, which sample with replacement.
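As a rough sketch of how resample could stand in for the old Bootstrap iterator, the snippet below draws row indices with replacement and treats the rows that were never drawn as the out-of-bag set. The helper name bootstrap_indices and its n_iter and seed parameters are made up for this illustration and are not part of scikit-learn; only resample and its random_state argument come from the library.

```python
import numpy as np
from sklearn.utils import resample


def bootstrap_indices(n_samples, n_iter=3, seed=0):
    """Hypothetical helper: yield (train_idx, oob_idx) index pairs."""
    rng = np.random.RandomState(seed)
    indices = np.arange(n_samples)
    for _ in range(n_iter):
        # One bootstrap step per call; the defaults sample with
        # replacement and return as many items as were passed in.
        train_idx = resample(indices, random_state=rng)
        # Out-of-bag rows are the ones never drawn in this step.
        oob_idx = np.setdiff1d(indices, train_idx)
        yield train_idx, oob_idx


splits = list(bootstrap_indices(9))
for train_idx, oob_idx in splits:
    print("train:", train_idx, "out-of-bag:", oob_idx)
```

Fit on X[train_idx], y[train_idx] and score on X[oob_idx], y[oob_idx] inside the loop to get one statistic per bootstrap step.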

https://scikit-learn.org/stable/modules/generated/sklearn.utils.resample.html