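Preparing a data frame for xgboost

Date: 2017-10-17 15:56:56

Tags: r xgboost data-processing

I have a data frame of external regressors, x, and a vector holding the dependent variable, response. I want to train an xgboost model. What should I pass as label in the xgboost function? Or is the way I am building the input wrong?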

g <- data.frame(target = response, x)
sm <- sparse.model.matrix(target ~ ., g)
fit <- xgboost(data = sm,
               label = ,
               eta = 0.1,
               max_depth = 15,
               nround = 25,
               subsample = 0.5,
               colsample_bytree = 0.5,
               seed = 1,
               eval_metric = "merror",
               objective = "reg:linear",
               num_class = 12,
               nthread = 3)

Thanks in advance!

1 answer:

Answer 0 (score: 0)

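# X_train holds the training samples, y_train the training labels;
# X_test holds the test samples.
# featureSet and loadTestData are placeholders: you need to write them yourself.
import xgboost as xgb

X_train, y_train = featureSet(data)
X_test = loadTestData(testFilePath)

# The label is attached to the features when the DMatrix is built.
dtrain = xgb.DMatrix(X_train, label=y_train)
num_rounds = 300
# params is a dict of booster parameters that you define yourself
# (for example objective, eta, max_depth).
model = xgb.train(params, dtrain, num_rounds)

# Predict on the test set
dtest = xgb.DMatrix(X_test)
ans = model.predict(dtest)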