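XGBoost expects a numeric matrix as its input data and a numeric vector as its label. However, I still get "invalid input data" and "label will be ignored" as error messages. My code is below. How can I pass a numeric matrix as the input data and a numeric vector as the label?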
# Re-factor target column
#Attempting to put numeric vector as label -- will this work tho....
train$NAME_EDUCATION_TYPE <- as.numeric(factor(train$NAME_EDUCATION_TYPE , labels = c(1:5)))
test$NAME_EDUCATION_TYPE <- as.numeric(factor(test$NAME_EDUCATION_TYPE , labels = c(1:5)))
# Replace NAs with median
train$AMT_ANNUITY[is.na(train$AMT_ANNUITY)] <- with(train, ave(AMT_ANNUITY, FUN = function(x) median(x, na.rm = TRUE)))[is.na(train$AMT_ANNUITY)]
train$EXT_SOURCE_1[is.na(train$EXT_SOURCE_1)] <- with(train, ave(EXT_SOURCE_1, FUN = function(x) median(x, na.rm = TRUE)))[is.na(train$EXT_SOURCE_1)]
train$EXT_SOURCE_2[is.na(train$EXT_SOURCE_2)] <- with(train, ave(EXT_SOURCE_2, FUN = function(x) median(x, na.rm = TRUE)))[is.na(train$EXT_SOURCE_2)]
train$EXT_SOURCE_3[is.na(train$EXT_SOURCE_3)] <- with(train, ave(EXT_SOURCE_3, FUN = function(x) median(x, na.rm = TRUE)))[is.na(train$EXT_SOURCE_3)]
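# Find the fraction of NAs in each column
library(dplyr)  # provides %>% and bind_rows()
lapply(seq_along(train), function(i) {
  data.frame(
    column = colnames(train)[i],
    # divide the NA count, not the whole data frame, by the number of rows
    pct_na = sum(is.na(train[[i]])) / nrow(train))
}) %>% bind_rows()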
#---Checks data types of train
str(train_raw)
#-----XGBOOST Model
#Current error messages: invalid input data, label will be ignored
xgb_model <- xgboost(data = suppressWarnings(as.numeric(as.matrix(train))),
                     label = train_raw$TARGET,
                     nrounds = 5,
                     objective = "binary:logistic",
                     params = list(
                       booster = "gblinear",
                       eta = 0.05,
                       lambda = 1,
                       lambda_bias = 1,
                       gamma = 1,
                       early_stopping_rounds = 3,
                       eval_metric = "rmse")
)
Any help resolving the "invalid input data" and "label will be ignored" errors would be greatly appreciated.
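For reference, a likely cause of both messages is that `as.numeric()` applied to a matrix drops its dimensions and returns a plain vector, so `xgboost()` no longer receives a matrix. The sketch below uses a small made-up data frame (`toy` and its values are hypothetical, not the real `train`) to show the difference, then builds the input with `data.matrix()`. It assumes the label is the `TARGET` column of the same data frame as the features. The `xgb.DMatrix()` / `xgb.train()` part only runs if the xgboost package is installed, and its parameters are illustrative rather than a recommendation.

```r
# Hypothetical toy data standing in for `train`; all values are made up.
toy <- data.frame(
  TARGET              = c(0, 1, 0, 1, 0, 1),
  AMT_ANNUITY         = c(100, 250, 180, 300, 120, 275),
  NAME_EDUCATION_TYPE = factor(c("a", "b", "a", "c", "b", "c"))
)

# as.numeric() on a matrix drops the dim attribute, leaving a plain vector.
flat <- as.numeric(as.matrix(toy[, "AMT_ANNUITY", drop = FALSE]))
is.matrix(flat)   # FALSE

# data.matrix() converts each column to numeric (factors become their
# integer codes) and keeps the matrix shape.
features <- setdiff(names(toy), "TARGET")
X <- data.matrix(toy[, features])
y <- toy$TARGET
is.matrix(X)      # TRUE

# Only attempted when xgboost is installed.
if (requireNamespace("xgboost", quietly = TRUE)) {
  dtrain <- xgboost::xgb.DMatrix(data = X, label = y)
  fit <- xgboost::xgb.train(
    params  = list(objective = "binary:logistic", eval_metric = "logloss"),
    data    = dtrain,
    nrounds = 5
  )
}
```

With the real data, the same idea would be `data.matrix(train[, setdiff(names(train), "TARGET")])` together with `label = train$TARGET`, assuming `TARGET` lives in `train` and not only in `train_raw`, and that both have the same number of rows.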