I am using the xgboost R package for a multi-class classification task. Here is a snippet of code I wrote to illustrate the problem (the inputs and outputs are randomly generated, so of course the results are meaningless; it is just something I put together while learning how to work with the package):
require(xgboost)
# First of all I set some parameters
featureNumber = 5
num_class = 4
obs = 1000
# I declare a function that I will use to generate my categorical labels
generateLabels <- function(x, num_class) {
  label <- 0
  if (runif(1, min = 0, max = 1) < 0.1) {
    label <- 0
  } else {
    label <- which.max(x) - 1
    foo <- runif(1, min = 0, max = 1)
    if (foo > 0.9) { label <- label + 1 }
    if (foo < 0.1) { label <- label - 1 }
  }
  return(max(min(label, num_class - 1), 0))
}
# I generate a random train set and its labels
features <- matrix(runif(featureNumber*obs, 1, 10), ncol = featureNumber)
labels <- apply(features, 1, generateLabels,num_class = num_class)
dTrain <- xgb.DMatrix(data = features, label = labels)
# I generate a random test set and its labels
testObs = floor(obs*0.25)
featuresTest <- matrix(runif(featureNumber*testObs, 1, 10), ncol = featureNumber)
labelsTest <- apply(featuresTest, 1, generateLabels, num_class = num_class)
dTest <- xgb.DMatrix(data = featuresTest, label = labelsTest)
# I train the model
xgbm <- xgb.train(data = dTrain,
                  nrounds = 10,
                  objective = "multi:softprob",
                  eval_metric = "mlogloss",
                  watchlist = list(train = dTrain, eval = dTest),
                  num_class = num_class)
This works as expected and produces the expected output; here are a few lines of it:
[0] train-mlogloss:1.221495 eval-mlogloss:1.292785
[1] train-mlogloss:0.999905 eval-mlogloss:1.121077
[2] train-mlogloss:0.846809 eval-mlogloss:1.014519
[3] train-mlogloss:0.735182 eval-mlogloss:0.942461
[4] train-mlogloss:0.650207 eval-mlogloss:0.891341
[5] train-mlogloss:0.580136 eval-mlogloss:0.851774
[6] train-mlogloss:0.524390 eval-mlogloss:0.827973
[7] train-mlogloss:0.475884 eval-mlogloss:0.815081
[8] train-mlogloss:0.435342 eval-mlogloss:0.799799
[9] train-mlogloss:0.402307 eval-mlogloss:0.789209
What I cannot manage to do is store these values so that I can use them later. Is it possible to do this? It would be very useful for tuning the parameters.
P.S. I know I could use the cross-validation method included in the package, xgb.cv, to obtain similar results; but I would rather use this approach so as to keep control over what is happening, because otherwise those metrics are all computed and then, it seems to me, the computing power is wasted, since they cannot be used for anything other than reading them on screen.
Answer 0 (score: 0)
You can use xgbm$bestScore and xgbm$bestInd.
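Beyond those two scalar fields, recent versions of the package also keep the full per-round history on the returned model object. A minimal sketch, assuming an xgboost version (>= 0.6) in which xgb.train returns an `evaluation_log` data.table, reusing the dTrain/dTest objects and parameters from the question:

```r
library(xgboost)

xgbm <- xgb.train(data = dTrain,
                  nrounds = 10,
                  objective = "multi:softprob",
                  eval_metric = "mlogloss",
                  watchlist = list(train = dTrain, eval = dTest),
                  num_class = num_class)

# The per-round metrics printed to the screen are also stored in the model
# object as a data.table with one row per round; the columns are named
# <watchlist name>_<metric>, here: iter, train_mlogloss, eval_mlogloss
log <- xgbm$evaluation_log
print(log)

# So, for example, the eval mlogloss of the last round and the round
# (1-based) that achieved the lowest eval mlogloss:
last_eval <- tail(log$eval_mlogloss, 1)
best_round <- which.min(log$eval_mlogloss)
```

Since the history is a plain data.table, it can be saved, plotted, or compared across hyperparameter settings without re-reading anything from the console.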