How to build a multivariate regression model with the xgboost package in R

Time: 2019-09-04 19:48:17

Tags: r prediction modeling

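I am trying to build a whole model using xgb in R to predict multiple variables. Because I have too many response variables, I need to do multivariate prediction to make the process faster.

I tried using "cbind". However, when I get to the DMatrix step, it returns an error. My data is: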

structure(list(X1 = c(16L, 10L, 16L, 2L, 16L, 8L, 16L, 6L, 12L, 
14L), X2 = c(1.2, 1.4, 2.2, 0.3, 1.2, 1, 1.4, 0.8, 2.6, 1.8), 
    X3 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), X4 = c(60L, 47L, 65L, 
    32L, 47L, 31L, 62L, 32L, 64L, 61L), X5 = c(0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 6L, 0L), X6 = c(0, 0, 0, 0, 0, 0, 0, 0, 0.6, 
    0), X7 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), X8 = c(69L, 58L, 
    50L, 51L, 57L, 59L, 25L, 76L, 10L, 45L), Y1 = c(47L, 65L, 
    32L, 47L, 31L, 62L, 32L, 64L, 61L, 46L), Y2 = c(58L, 50L, 
    51L, 57L, 59L, 25L, 76L, 10L, 45L, 48L)), row.names = c(NA, 
10L), class = "data.frame")

and the code that I am using is:
    # Libraries
    library(xgboost)
    library(magrittr)
    library(Matrix)
    # Data partitioning
    # (split the data into `train` and `test` here)
    # One-hot encoding for the training and testing sets
    # (converts the factor variables to dummy variables)
    trainm <- sparse.model.matrix(cbind(Y1, Y2) ~ ., data = train)
    train_label <- cbind(train$Y1, train$Y2) # the response variables
    testm <- sparse.model.matrix(cbind(Y1, Y2) ~ ., data = test)
    test_label <- cbind(test$Y1, test$Y2)
    # Matrices for xgb: dtrain and dtest; "label" is the dependent variable
    dtrain <- xgb.DMatrix(trainm, label = train_label) # this step fails
    dtest <- xgb.DMatrix(testm, label = test_label)
    # Building the model is not a problem because it just uses dtrain.
    # Check the error on the testing data
    yhat_xg <- predict(xg_mod, dtest)
    (MSE_xgb <- mean((yhat_xg - test_label)^2))
    # Predictions on the training and testing data
    pred_train <- predict(xg_mod, newdata = dtrain)
    pred_test <- predict(xg_mod, newdata = dtest)
    # Evaluation metrics
    metrics <- data.frame(
      RMSE = caret::RMSE(pred_test, cbind(test$Y1, test$Y2)),
      Rsquare = caret::R2(pred_test, cbind(test$Y1, test$Y2)),
      MAE = caret::MAE(pred_test, cbind(test$Y1, test$Y2))
    )

At the xgb.DMatrix step, I receive the following message:

    Error in setinfo.xgb.DMatrix(dmat, names(p), p[[1]]) :
      The length of labels must equal to the number of rows in the input data
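
The message can be reproduced without xgboost. The following is a minimal base-R sketch, assuming the check compares `length(label)` with the number of rows, as the wording of the message suggests. The toy data frame is hypothetical and only stands in for `train`.

```r
# Toy stand-in for the training data: 8 rows, two response columns.
train <- data.frame(X1 = 1:8, Y1 = 11:18, Y2 = 21:28)

# cbind() builds an 8 x 2 matrix, so it holds 16 values in total.
train_label <- cbind(train$Y1, train$Y2)

n_rows   <- nrow(train)          # rows in the input data
n_labels <- length(train_label)  # label values: rows * number of responses

# The two-column label has twice as many values as there are rows.
n_labels == n_rows

# A single response column has exactly one value per row.
length(train$Y1) == n_rows
```

With two responses bound together, the label holds two values per row, which would explain why a length check against the row count fails.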

I would like the model to predict multiple response variables.
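
One possible workaround, sketched under assumptions, is to fit a separate booster for each response so that every label vector has one value per row. This is not a true multivariate model and it does not reduce training time; it only avoids the label-length mismatch. The toy data, the `make_data` helper, the `reg:squarederror` objective, and `nrounds = 20` are illustrative choices, not taken from the question.

```r
library(xgboost)

# Toy data standing in for the real train/test split (hypothetical values).
set.seed(1)
make_data <- function(n) {
  data.frame(X1 = runif(n), X2 = runif(n), Y1 = runif(n), Y2 = runif(n))
}
train <- make_data(40)
test  <- make_data(10)

responses  <- c("Y1", "Y2")
predictors <- setdiff(names(train), responses)

# Predictors only; every column is numeric here, so as.matrix() is enough.
train_x <- as.matrix(train[, predictors])
test_x  <- as.matrix(test[, predictors])

# Fit one booster per response and collect the test predictions by column.
preds <- sapply(responses, function(resp) {
  dtrain <- xgb.DMatrix(train_x, label = train[[resp]])
  model <- xgb.train(
    params = list(objective = "reg:squarederror"),
    data = dtrain,
    nrounds = 20
  )
  predict(model, xgb.DMatrix(test_x))
})

# One row per test row, one column per response.
dim(preds)

# Per-response mean squared error on the test set.
colMeans((preds - as.matrix(test[, responses]))^2)
```

If the real data contains factor columns, the predictor matrix could be built with `sparse.model.matrix` as in the question, using only the predictor columns.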

0 answers:

No answers