I am currently trying out the new xgboostExplainer package.
I am following the code on the GitHub page https://github.com/AppliedDataSciencePartners/xgboostExplainer/blob/master/R/explainPredictions.R
On line 34, the xgboost model is fitted:
xgb.model <- xgboost(param =param, data = xgb.train.data, nrounds=3)
However, on line 43 I run into a problem:
explainer = buildExplainer(xgb.model,xgb.train.data, type="binary", base_score = 0.5, n_first_tree = xgb.model$best_ntreelimit - 1)
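I understand that n_first_tree is deprecated, but I cannot seem to access the xgb.model$best_ntreelimit - 1 part.
The components I can access on the xgboost model are:
handle, raw, niter, evaluation_log, call, params, callbacks, feature_names
best_ntreelimit is not among them.
Has anyone else run into this problem?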
Edit: this is the output of showWaterfall():
Extracting the breakdown of each prediction...
|=============================================================| 100%
DONE!
Prediction: NA
Weight: NA
Breakdown
intercept cap-shape=bell
NA NA
cap-shape=conical cap-shape=convex
NA NA
cap-shape=flat cap-shape=knobbed
NA NA
cap-shape=sunken cap-surface=fibrous
NA NA
cap-surface=grooves cap-surface=scaly
NA NA
cap-surface=smooth cap-color=brown
NA NA
cap-color=buff cap-color=cinnamon
NA NA
cap-color=gray cap-color=green
NA NA
cap-color=pink cap-color=purple
NA NA
cap-color=red cap-color=white
NA NA
cap-color=yellow bruises?=bruises
NA NA
bruises?=no odor=almond
NA NA
odor=anise odor=creosote
NA NA
odor=fishy odor=foul
NA NA
odor=musty odor=none
NA NA
odor=pungent odor=spicy
NA NA
gill-attachment=attached gill-attachment=descending
NA NA
gill-attachment=free gill-attachment=notched
NA NA
gill-spacing=close gill-spacing=crowded
NA NA
gill-spacing=distant gill-size=broad
NA NA
gill-size=narrow gill-color=black
NA NA
gill-color=brown gill-color=buff
NA NA
gill-color=chocolate gill-color=gray
NA NA
gill-color=green gill-color=orange
NA NA
gill-color=pink gill-color=purple
NA NA
gill-color=red gill-color=white
NA NA
gill-color=yellow stalk-shape=enlarging
NA NA
stalk-shape=tapering stalk-root=bulbous
NA NA
stalk-root=club stalk-root=cup
NA NA
stalk-root=equal stalk-root=rhizomorphs
NA NA
stalk-root=rooted stalk-root=missing
NA NA
stalk-surface-above-ring=fibrous stalk-surface-above-ring=scaly
NA NA
stalk-surface-above-ring=silky stalk-surface-above-ring=smooth
NA NA
stalk-surface-below-ring=fibrous stalk-surface-below-ring=scaly
NA NA
stalk-surface-below-ring=silky stalk-surface-below-ring=smooth
NA NA
stalk-color-above-ring=brown stalk-color-above-ring=buff
NA NA
stalk-color-above-ring=cinnamon stalk-color-above-ring=gray
NA NA
stalk-color-above-ring=orange stalk-color-above-ring=pink
NA NA
stalk-color-above-ring=red stalk-color-above-ring=white
NA NA
stalk-color-above-ring=yellow stalk-color-below-ring=brown
NA NA
stalk-color-below-ring=buff stalk-color-below-ring=cinnamon
NA NA
stalk-color-below-ring=gray stalk-color-below-ring=orange
NA NA
stalk-color-below-ring=pink stalk-color-below-ring=red
NA NA
stalk-color-below-ring=white stalk-color-below-ring=yellow
NA NA
veil-type=partial veil-type=universal
NA NA
veil-color=brown veil-color=orange
NA NA
veil-color=white veil-color=yellow
NA NA
ring-number=none ring-number=one
NA NA
ring-number=two ring-type=cobwebby
NA NA
ring-type=evanescent ring-type=flaring
NA NA
ring-type=large ring-type=none
NA NA
ring-type=pendant ring-type=sheathing
NA NA
ring-type=zone spore-print-color=black
NA NA
spore-print-color=brown spore-print-color=buff
NA NA
spore-print-color=chocolate spore-print-color=green
NA NA
spore-print-color=orange spore-print-color=purple
NA NA
spore-print-color=white spore-print-color=yellow
NA NA
population=abundant population=clustered
NA NA
population=numerous population=scattered
NA NA
population=several population=solitary
NA NA
habitat=grasses habitat=leaves
NA NA
habitat=meadows habitat=paths
NA NA
habitat=urban habitat=waste
NA NA
habitat=woods
NA
-3.89182 -3.178054 -2.751535 -2.442347 -2.197225 -1.99243 -1.81529 -1.658228 -1.516347 -1.386294 -1.265666 -1.15268 -1.045969 -0.9444616 -0.8472979 -0.7537718 -0.6632942 -0.5753641 -0.4895482 -0.4054651 -0.3227734 -0.2411621 -0.1603427 -0.08004271 0 0.08004271 0.1603427 0.2411621 0.3227734 0.4054651 0.4895482 0.5753641 0.6632942 0.7537718 0.8472979 0.9444616 1.045969 1.15268 1.265666 1.386294 1.516347 1.658228 1.81529 1.99243 2.197225 2.442347 2.751535 3.178054 3.89182
Error in if (abs(values[i]) > put_rect_text_outside_when_value_below) { :
missing value where TRUE/FALSE needed
Edit: this is the code I ran:
library(xgboost)
data(agaricus.train, package='xgboost')
data(agaricus.test, package='xgboost')
train <- agaricus.train
test <- agaricus.test
xgb.train.data <- xgb.DMatrix(train$data, label = train$label)
xgb.test.data <- xgb.DMatrix(test$data, label = test$label)
param <- list(objective = "binary:logistic")
model.cv <- xgb.cv(param = param,
                   data = xgb.train.data,
                   nrounds = 500,
                   early_stopping_rounds = 10,
                   nfold = 3)
model.cv$best_ntreelimit
xgb.model <- xgboost(param =param, data = xgb.train.data, nrounds = 10)
explained <- buildExplainer(xgb.model, xgb.train.data, type="binary", base_score = 0.5, n_first_tree = 9)
pred.breakdown = explainPredictions(xgb.model,
                                    explained,
                                    xgb.test.data)
showWaterfall(xgb.model,
              explained,
              xgb.test.data, test$data, 2, type = "binary")
Answer 0 (score: 0)
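I tested the code from the linked page.
best_ntreelimit is a value returned by xgb.cv when early_stopping_rounds is set. From the help for xgb.cv:
best_ntreelimit: the ntreelimit value corresponding to the best iteration, which could further be used in predict method (only available with early stopping).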
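That missing field also explains why the expression in the question fails quietly instead of raising a clear error. In R, asking a list for an element it does not have returns NULL, and NULL - 1 evaluates to a zero-length numeric vector. The sketch below uses a plain list as a stand-in for the fitted model (an assumption for illustration, with field names taken from the question; the fallback value of 9 is hypothetical) and shows a simple guard:

```r
# Stand-in for the fitted model: a plain list with some of the fields
# listed in the question. This is NOT a real xgb.Booster object.
model_like <- list(niter = 10, params = list(objective = "binary:logistic"))

missing_field <- model_like$best_ntreelimit  # NULL: the list has no such element
n_first_tree  <- missing_field - 1           # numeric(0), not an error

print(is.null(missing_field))  # TRUE
print(length(n_first_tree))    # 0

# Guard: fall back to an explicit integer when the field is absent.
# The fallback of 9 is a hypothetical value chosen for illustration.
fallback <- 9L
n_first_tree <- if (is.null(model_like$best_ntreelimit)) {
  fallback
} else {
  model_like$best_ntreelimit - 1
}
print(n_first_tree)  # 9
```

A zero-length n_first_tree passed on to another function may only surface as a confusing error or as NA results much later, so checking for NULL up front tends to be easier to debug.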
You can obtain best_ntreelimit with xgb.cv:
library(xgboost)
data(agaricus.train, package='xgboost')
data(agaricus.test, package='xgboost')
train <- agaricus.train
test <- agaricus.test
xgb.train.data <- xgb.DMatrix(train$data, label = train$label)
param <- list(objective = "binary:logistic")
model.cv <- xgb.cv(param = param,
                   data = xgb.train.data,
                   nrounds = 500,
                   early_stopping_rounds = 10,
                   nfold = 3)
model.cv$best_ntreelimit
#output
9
However, the output of xgb.cv cannot be used to build the explainer, so you need to fit a model with xgboost:
xgb.model <- xgboost(param =param, data = xgb.train.data, nrounds = 10)
and set n_first_tree to an explicit integer:
explained <- buildExplainer(xgb.model, xgb.train.data, type="binary", base_score = 0.5, n_first_tree = 9)
Edit: I forgot to paste the following code:
xgb.test.data <- xgb.DMatrix(test$data, label = test$label)
pred.breakdown = explainPredictions(xgb.model,
                                    explained,
                                    xgb.test.data)
Now you can run:
showWaterfall(xgb.model,
              explained,
              xgb.test.data, test$data, 2, type = "binary")
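If showWaterfall still stops with "missing value where TRUE/FALSE needed", as in the output posted in the question, that message comes from R evaluating if() on an NA, which suggests the breakdown itself contains NA values. One quick way to confirm this before plotting is to look for incomplete rows. The sketch below uses a small made-up data frame in place of pred.breakdown (the column names and numbers are invented for illustration):

```r
# Made-up stand-in for pred.breakdown; row 2 mimics the all-NA breakdown
# shown in the question.
breakdown <- data.frame(
  intercept = c(-0.21, NA),
  odor_none = c(1.34, NA)
)

# Indices of rows that contain at least one NA
na_rows <- which(!complete.cases(breakdown))
print(na_rows)

# if() cannot decide on an NA condition, which is what raises the error
outcome <- tryCatch(
  if (abs(breakdown$intercept[2]) > 0.5) "outside" else "inside",
  error = function(e) "error: condition was NA"
)
print(outcome)
```

With the real object, the equivalent check would be which(!complete.cases(pred.breakdown)), assuming pred.breakdown behaves like a data frame. If every row is flagged, the problem lies in building the explainer or the breakdown rather than in the plotting call.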