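I am trying to build a model with xgboost in R that predicts multiple variables. Because I have a large number of response variables, I need multivariate prediction to make the process faster.
I tried using cbind(). However, when I get to xgb.DMatrix, it returns an error. My data is: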
structure(list(X1 = c(16L, 10L, 16L, 2L, 16L, 8L, 16L, 6L, 12L,
14L), X2 = c(1.2, 1.4, 2.2, 0.3, 1.2, 1, 1.4, 0.8, 2.6, 1.8),
X3 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), X4 = c(60L, 47L, 65L,
32L, 47L, 31L, 62L, 32L, 64L, 61L), X5 = c(0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 6L, 0L), X6 = c(0, 0, 0, 0, 0, 0, 0, 0, 0.6,
0), X7 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), X8 = c(69L, 58L,
50L, 51L, 57L, 59L, 25L, 76L, 10L, 45L), Y1 = c(47L, 65L,
32L, 47L, 31L, 62L, 32L, 64L, 61L, 46L), Y2 = c(58L, 50L,
51L, 57L, 59L, 25L, 76L, 10L, 45L, 48L)), row.names = c(NA,
10L), class = "data.frame")
The code that I am using is:
#libraries
library(xgboost)
library(magrittr)
library(Matrix)
#data Partitioning
#making whatever partitions for "train and test" data
#One Hot encoding for training and testing sets:
trainm <- sparse.model.matrix(cbind(Y1, Y2) ~ ., data = train) # convert the factor variables to dummy variables
train_label <- cbind(train$Y1, train$Y2) #the response variable
testm<- sparse.model.matrix(cbind(Y1,Y2)~., data = test)
test_label <- cbind(test$Y1, test$Y2)
# Matrix for xgb: dtrain and dtest, "label" is the dependent variable
dtrain <- xgb.DMatrix(trainm, label = train_label)
dtest <- xgb.DMatrix(testm, label = test_label)
#Building the model is not a problem because it just uses the dtrain.
# Check error in testing data
yhat_xg <- predict(xg_mod, dtest)
(MSE_xgb <- mean((yhat_xg - test_label)^2))
#Predictions on test and training data
pred_test <- predict(xg_mod, newdata = dtest, class = 'response')
pred_train <- predict(xg_mod, newdata = dtrain, class = 'response')
#Evaluation metrics
Metrics <- data.frame(
RMSE = caret::RMSE(pred_test, cbind(test$Y1, test$Y2)),
Rsquare = caret::R2(pred_test, cbind(test$Y1, test$Y2)),
MAE = caret::MAE(pred_test, cbind(test$Y1, test$Y2))
)
At the xgb.DMatrix step, I get this error message:
Error in setinfo.xgb.DMatrix(dmat, names(p), p[[1]]) :
  The length of labels must equal to the number of rows in the input data
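A quick check that may explain the message: cbind() on two response columns gives a two-column matrix, so its length() is twice the number of rows. The sketch below uses a made-up 10-row data frame (the values and the name x are assumptions, not the data above) and only base R.

```r
# Hypothetical 10-row frame with two response columns, mirroring the layout above
train <- data.frame(X1 = 1:10, Y1 = 11:20, Y2 = 21:30)

x <- as.matrix(train["X1"])               # feature matrix: 10 rows
train_label <- cbind(train$Y1, train$Y2)  # label matrix: 10 rows x 2 columns

nrow(x)               # rows in the feature matrix
dim(train_label)      # rows and columns of the label matrix
length(train_label)   # total element count of the label matrix

# A single response column, by contrast, has one value per row
length(train$Y1) == nrow(x)
```

Here length(train_label) is 20 while nrow(x) is 10, which is the kind of mismatch the error text describes.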
I would like the model to predict multiple variables based on my responses.
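One possible workaround, offered as a sketch rather than a definitive answer, is to fit one booster per response column and bind the predictions together. The toy data, the parameter choices (nrounds = 20, the reg:squarederror objective), and the names responses, features, models, and preds are all assumptions for illustration; nothing here is tuned, and it does not share any work between responses.

```r
library(xgboost)

# Toy data (illustrative only): two features, two numeric responses
set.seed(1)
df <- data.frame(X1 = runif(40), X2 = runif(40))
df$Y1 <- 2 * df$X1 + rnorm(40, sd = 0.1)
df$Y2 <- df$X2 - df$X1 + rnorm(40, sd = 0.1)

responses <- c("Y1", "Y2")
features <- setdiff(names(df), responses)  # keep responses out of the predictors

train <- df[1:30, ]
test <- df[31:40, ]
x_train <- as.matrix(train[, features])
x_test <- as.matrix(test[, features])

# Fit one model per response; each label is a plain vector with one value per row
models <- lapply(responses, function(r) {
  dtrain <- xgb.DMatrix(data = x_train, label = train[[r]])
  xgb.train(params = list(objective = "reg:squarederror"),
            data = dtrain, nrounds = 20)
})

# One column of predictions per response
preds <- sapply(models, function(m) predict(m, xgb.DMatrix(data = x_test)))
colnames(preds) <- responses

# Per-response test MSE
colMeans((preds - as.matrix(test[, responses]))^2)
```

With many responses the loop grows linearly in cost, so it only addresses the shape error, not the speed concern.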