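I am building an XGBoost model to predict a binary choice, but I am having trouble generating predictions. How do I get from the end of this code to actual predictions on the test data? My data has 7 independent variables and one dependent variable, which is the binary choice.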
choice <- dataset_training$choiceprobX
set.seed(1234)
ind <- sample(2, nrow(dataset_training), replace=TRUE, prob=c(0.67, 0.33))
training <- as.matrix(dataset_training[ind==1, 1:7])
head(training)
testing <- as.matrix(dataset_training[ind==2, 1:7])
head(testing)
dataset_trainLabel <- dataset_training[ind==1, 8]
head(dataset_trainLabel)
dataset_testLabel <- dataset_training[ind==2, 8]
head(dataset_testLabel)
xgb.train <- xgb.DMatrix(data=training,label=dataset_trainLabel)
xgb.test <- xgb.DMatrix(data=testing,label=dataset_testLabel)
params = list(
  booster="gbtree",
  eta=0.01,
  max_depth=5,
  gamma=3,
  subsample=0.75,
  colsample_bytree=1,
  objective="binary:logistic",
  eval_metric="logloss"
)
xgb.fit=xgb.train(
  params=params,
  data=xgb.train,
  nrounds=10,
  nthread=1,  # the parameter is spelled nthread, not nthreads
  early_stopping_rounds=10,
  watchlist=list(val1=xgb.train,val2=xgb.test),
  verbose=0
)
xgb.fit
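My goal is to generate a confusion matrix, but when I try, I am told that the data and the reference must be factors with the same levels.
Answer 0 (score: 0)
Since I don't have your data, let's use the iris example dataset: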
library(xgboost)
set.seed(100)
data = iris
# Recode the target as 0/1: 1 if the species is versicolor, 0 otherwise
data$Species = as.numeric(data$Species=="versicolor")
idx = sample(nrow(data),100)
dtrain <- xgb.DMatrix(as.matrix(data[idx,-5]), label = data$Species[idx])
dtest <- xgb.DMatrix(as.matrix(data[-idx,-5]), label = data$Species[-idx])
param <- list(max_depth = 2, eta = 1, nthread = 2,
              objective = "binary:logistic", eval_metric = "logloss")
# The watchlist must be defined before it is passed to xgb.train
watchlist <- list(train = dtrain, test = dtest)
xgb.fit <- xgb.train(param, dtrain, nrounds = 10, watchlist = watchlist, verbose = 0)
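With objective = "binary:logistic", predict() returns probabilities rather than class labels, which is why a threshold is still needed before a confusion matrix can be built. The following is a minimal base-R sketch of that relationship; the margin values are made up for illustration and do not come from the model above.

```r
# Hypothetical raw margin scores (log-odds); made-up values for illustration
margin <- c(-2.0, -0.1, 0.0, 0.3, 1.5)

# Logistic (sigmoid) transform: maps log-odds to probabilities in (0, 1)
prob <- plogis(margin)  # same as 1 / (1 + exp(-margin))

# A probability above 0.5 corresponds exactly to a positive margin
pred_class <- as.numeric(prob > 0.5)

print(data.frame(margin, prob = round(prob, 3), pred_class))
```

A margin of exactly 0 gives a probability of exactly 0.5, which the strict comparison assigns to class 0.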
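To create the confusion matrix, we can convert the predicted probabilities to 0 and 1 (using probability > 0.5 as the cutoff), then pass the resulting table to the confusionMatrix function: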
library(caret)
pred = as.numeric(predict(xgb.fit,dtest) >0.5)
obs = getinfo(dtest, "label")
confusionMatrix(table(pred,obs))
Confusion Matrix and Statistics

    obs
pred  0  1
   0 34  0
   1  1 15

               Accuracy : 0.98
                 95% CI : (0.8935, 0.9995)
    No Information Rate : 0.7
    P-Value [Acc > NIR] : 4.034e-07

                  Kappa : 0.9533

 Mcnemar's Test P-Value : 1

            Sensitivity : 0.9714
            Specificity : 1.0000
         Pos Pred Value : 1.0000
         Neg Pred Value : 0.9375
             Prevalence : 0.7000
         Detection Rate : 0.6800
   Detection Prevalence : 0.6800
      Balanced Accuracy : 0.9857

       'Positive' Class : 0
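The error in the question usually appears when confusionMatrix is given two vectors directly and they are not factors sharing one set of levels (for example, numeric probabilities against 0/1 labels). As an alternative to passing a table, here is a minimal base-R sketch of building both factors with the same explicit levels. The probabilities and labels are made-up values, and the names prob, obs, pred_f and obs_f are only illustrative.

```r
# Hypothetical predicted probabilities and true 0/1 labels (made-up values)
prob <- c(0.91, 0.12, 0.47, 0.78, 0.05, 0.66)
obs  <- c(1, 0, 1, 1, 0, 0)

# Threshold at 0.5, then give both vectors the same explicit factor levels,
# so the levels match even if one class never occurs in the predictions
lv <- c(0, 1)
pred_f <- factor(as.numeric(prob > 0.5), levels = lv)
obs_f  <- factor(obs, levels = lv)

# Cross-tabulate predictions against observations
cm <- table(pred = pred_f, obs = obs_f)
print(cm)

# With caret loaded, the two factors could be passed directly, e.g.
# confusionMatrix(pred_f, obs_f, positive = "1")
```

Setting positive = "1" is optional; without it, caret treats the first level as the positive class, which is why the output above reports 'Positive' Class : 0.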