Question

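Can someone explain how the Cover column in the xgboost R package is calculated in the xgb.model.dt.tree function?

The documentation says that Cover "is a metric to measure the number of observations affected by the split".

When you run the following code, given in the xgboost documentation for this function, Cover for node 0 of tree 0 is 1628.2500.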

data(agaricus.train, package='xgboost')

#Both datasets are lists with two items, a sparse matrix and labels
#(labels = outcome column which will be learned).
#Each column of the sparse matrix is a feature in one-hot encoding format.
train <- agaricus.train

bst <- xgboost(data = train$data, label = train$label, max.depth = 2,
               eta = 1, nthread = 2, nround = 2,objective = "binary:logistic")

#agaricus.train$data@Dimnames[[2]] represents the column names of the sparse matrix.
xgb.model.dt.tree(agaricus.train$data@Dimnames[[2]], model = bst)

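There are 6513 observations in the train dataset, so can anyone explain why Cover for node 0 of tree 0 is a quarter of this number (1628.25)?

Also, Cover for node 1 of tree 1 is 788.852 - how is this number calculated?

Any help would be much appreciated. Thanks.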

Answer 1

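Cover is defined in xgboost as:

"the sum of second order gradient of training data classified to the leaf, if it is square loss, this simply corresponds to the number of instances in that branch. Deeper in the tree a node is, lower this metric will be"

https://github.com/dmlc/xgboost/blob/f5659e17d5200bd7471a2e735177a81cb8d3012b/R-package/man/xgb.plot.tree.Rd Not particularly well documented...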

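To calculate the cover, we need to know the predictions at that point in the tree, and the second derivative of the loss function with respect to those predictions.

Luckily for us, the prediction for every data point (all 6513 of them) in the 0-0 node in your example is .5. This is a global default setting: your first prediction, at t = 0, is .5.

base_score [default=0.5] the initial prediction score of all instances, global bias

http://xgboost.readthedocs.org/en/latest/parameter.html

The gradient of binary logistic (which is your objective function) is p - y, where p = your prediction and y = the true label.

Thus, the hessian (which we need for this) is p * (1 - p). Note: the hessian can be determined without y, the true labels.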
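As a sanity check on the two formulas above, here is a minimal base R sketch that compares p - y and p * (1 - p) against finite-difference derivatives of the logistic loss, taken with respect to the raw margin (log-odds) z. The helper names (logloss, num_grad, num_hess) and the values of z and y are made up for illustration; none of this is xgboost's own code.

```r
# Logistic loss as a function of the raw margin z (log-odds) for a label y.
logloss <- function(z, y) {
  p <- 1 / (1 + exp(-z))
  -(y * log(p) + (1 - y) * log(1 - p))
}

# Central finite-difference approximations of the first and second derivatives.
num_grad <- function(z, y, h = 1e-4) {
  (logloss(z + h, y) - logloss(z - h, y)) / (2 * h)
}
num_hess <- function(z, y, h = 1e-4) {
  (logloss(z + h, y) - 2 * logloss(z, y) + logloss(z - h, y)) / h^2
}

z <- 0.7                  # arbitrary margin, chosen for illustration
y <- 1                    # arbitrary label
p <- 1 / (1 + exp(-z))    # predicted probability at that margin

# Numeric and analytic values should agree closely.
c(numeric = num_grad(z, y), analytic = p - y)
c(numeric = num_hess(z, y), analytic = p * (1 - p))
```

Changing y from 1 to 0 changes the gradient but leaves the second derivative as it was, which is why the hessian can be computed without the labels.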

So (bringing it home):

6513 * (.5) * (1 - .5) = 1628.25
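The same arithmetic can be written as a sum of per-row hessians, which is closer to how cover is defined. This is a small base R sketch that assumes only the 6513 rows and the default base_score of 0.5 mentioned above; the variable names are illustrative.

```r
n_obs <- 6513        # number of training rows, from the question
base_score <- 0.5    # default initial prediction for every row

# Every row starts with the same prediction, so every hessian is p * (1 - p).
hess <- rep(base_score * (1 - base_score), n_obs)

# The root node contains all rows, so its cover is the sum over all of them.
root_cover <- sum(hess)
root_cover
# [1] 1628.25
```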

In the second tree, the predictions at that point are no longer all .5, so let's get the predictions after one tree:

p = predict(bst,newdata = train$data, ntree=1)

head(p)
[1] 0.8471184 0.1544077 0.1544077 0.8471184 0.1255700 0.1544077

sum(p*(1-p))  # sum of the hessians in that node,(root node has all data)
[1] 788.8521

Note that for linear (squared error) regression the hessian is always one, so the cover indicates how many examples are in that leaf.
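To illustrate the squared-error case, the sketch below assumes the loss is written as 0.5 * (y - pred)^2, whose second derivative with respect to pred is 1 for every row. The predictions and the helper sq_hess are made up for illustration.

```r
# For 0.5 * (y - pred)^2 the second derivative is 1,
# whatever the prediction or the label happens to be.
sq_hess <- function(pred) rep(1, length(pred))

preds <- c(0.2, 1.7, -3.1, 0.0, 5.4)   # made-up predictions for rows in one leaf

# Summing the hessians therefore just counts the rows in the leaf.
cover <- sum(sq_hess(preds))
cover == length(preds)
```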

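The big takeaway is that cover is defined by the hessian of the objective function. There is plenty of material available on deriving the gradient and hessian of the binary logistic function.

These slides are helpful for seeing why he uses hessians as a weighting, and they also explain how xgboost splits differently from standard trees. https://homes.cs.washington.edu/~tqchen/pdf/BoostedTree.pdf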
