Question

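I would like to know whether there is a best practice for exponentially weighting a random forest's training samples by time (putting more weight on the most recent samples). One approach I can think of is to resample the entire dataset with replacement according to the given weights. Are there other approaches I should consider? It would also be great if anyone knows of Python packages that could help with this. Any help is much appreciated!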

Answer 1

The sklearn implementation of random forests lets you pass per-sample weights to the fit method.

import numpy
from sklearn.ensemble import RandomForestClassifier

# fill sample_weights with the desired weighting (one weight per row of X)
sample_weights = numpy.ones(y.shape)
estimator = RandomForestClassifier()
estimator.fit(X, y, sample_weight=sample_weights)
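To make the weights exponential in time, as the question asks, one option is to decay each sample's weight by its age. Below is a minimal sketch, not a definitive recipe: the helper `exponential_time_weights`, the `half_life` parameter, the synthetic data, and the integer `timestamps` are all assumptions made up for illustration. Only `RandomForestClassifier` and the `sample_weight` argument of `fit` come from sklearn.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def exponential_time_weights(timestamps, half_life):
    """Hypothetical helper: weights that halve for every `half_life` units of age."""
    timestamps = np.asarray(timestamps, dtype=float)
    age = timestamps.max() - timestamps  # 0 for the most recent sample
    return 0.5 ** (age / half_life)


# Synthetic data, purely for illustration.
rng = np.random.default_rng(0)
n_samples = 200
X = rng.normal(size=(n_samples, 4))
y = (X[:, 0] + 0.5 * rng.normal(size=n_samples) > 0).astype(int)
timestamps = np.arange(n_samples)  # e.g. days since the first observation

# The newest sample gets weight 1.0; a sample 30 time units older gets 0.5.
weights = exponential_time_weights(timestamps, half_life=30.0)

estimator = RandomForestClassifier(n_estimators=100, random_state=0)
estimator.fit(X, y, sample_weight=weights)
```

The half-life is a tuning knob: a short one makes the forest track recent data closely, a long one approaches uniform weighting. It is probably best chosen by validating on the most recent period rather than by random cross-validation.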

Time-weighted samples when using random forests

1 answer: