I only run into this problem with certain data sets. With the following input data, the results look fine:
str(trainDataFrame, list.len = ncol(trainDataFrame))
'data.frame': 486 obs. of 173 variables:
$ snaive : int
$ arima : num
$ ets : num
$ stl : num
$ tsAverage : num
$ horizon : Factor w/ 12 levels
$ OpenLag1 : num
$ OpenLag2 : num
$ OpenLag3 : num
$ CloseLag1 : num
$ CloseLag2 : num
$ CloseLag3 : num
$ US.HR.RecruitingLag1 : int
$ US.HR.RecruitingLag2 : int
$ US.HR.RecruitingLag3 : int
$ US.Employment.rateLag1 : num
$ US.Employment.rateLag2 : num
$ US.Employment.rateLag3 : num
$ Services.Person.HireLag1 : int
$ Services.Person.HireLag2 : int
$ Services.Person.HireLag3 : int
$ target : num
$ trend : int
$ season : Factor w/ 13 levels
$ numericIndex : num
$ arima_ets : num
$ arima_stl : num
$ ets_stl : num
$ arima_ets_stl : num
$ arima_ets_snaive : num
$ arima_stl_snaive : num
$ ets_stl_snaive : num
However, with the following input data, the output predictions come back as chr:
str(trainDataFrame)
'data.frame': 234 obs. of 46 variables:
$ snaive : num
$ arima : num
$ ets : num
$ tsAverage : num
$ horizon : Factor w/ 12 levels
$ HiPoLag1 : num
$ HiPoLag2 : num
$ HiPoLag3 : num
$ Calendar.DaysLag1 : int
$ Calendar.DaysLag2 : int
$ Calendar.DaysLag3 : int
$ Consumption.DaysLag1 : int
$ Consumption.DaysLag2 : int
$ Consumption.DaysLag3 : int
$ target : num
$ trend : int
$ season : Factor w/ 13 levels
$ numericIndex : num
$ arima_ets : num
This is the result from the second input data. Note that ranger$pred$pred is chr rather than num.
$ ranger :List of 23
..$ method : chr "ranger"
..$ modelInfo :List of 15 ...
..$ modelType : chr "Regression"
..$ results :'data.frame': 5 obs. of 5 variables: ...
..$ pred :'data.frame': 720 obs. of 5 variables:
.. ..$ pred : chr [1:720]
.. ..$ obs : num [1:720]
.. ..$ rowIndex: int [1:720] 102 102 102 102 102 101 114 101 114 101 ...
.. ..$ mtry : num [1:720] 3 14 26 37 49 3 3 14 14 26 ...
.. ..$ Resample: chr [1:720] "Training02" "Training02" "Training02"
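A check along the following lines can confirm which columns of the saved predictions are character, and whether the character values still parse as numbers. This is only a sketch: `pred_df` is a made-up stand-in for `trainedModels$ranger$pred`, and its values are invented for illustration.

```r
# Hypothetical stand-in for trainedModels$ranger$pred; all values are invented
pred_df <- data.frame(
  pred     = c("1.5", "2.25", "3"),
  obs      = c(1.4, 2.0, 3.1),
  rowIndex = c(102L, 101L, 114L),
  stringsAsFactors = FALSE
)

# Report the first class of every column to spot the unexpected character one
col_classes <- vapply(pred_df, function(col) class(col)[1], character(1))
print(col_classes)

# Check whether every character prediction can be parsed back into a number
parsed <- suppressWarnings(as.numeric(pred_df$pred))
all_parse <- !anyNA(parsed)
print(all_parse)
```

If every value parses, the predictions were probably coerced to character somewhere along the way rather than being genuinely non-numeric, though that alone does not say where the coercion happens.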
In case the code is needed, this is what I use to call the train function for both data sets:
# Base options, then add the resampling indexes and keep all predictions
trControl <- list(verboseIter = TRUE)
trControl <- c(list(index = cvindexes[["cvtrainidx"]],
                    indexOut = cvindexes[["cvtestidx"]],
                    savePredictions = "all"),
               trControl)
caretTrainControl <- do.call(caret::trainControl, trControl)

# Train one model per entry in mlParams, merging the shared arguments
# with the model-specific ones
trainedModels <- lapply(
  mlParams,
  function(x) do.call(caret::train, c(list(form = target ~ .,
                                           data = trainDataFrame,
                                           trControl = caretTrainControl),
                                      x))
)
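To see what each `do.call` receives, the argument list can be built and inspected without training anything. This is a minimal sketch using base R only: `mlParamsDemo`, `buildArgs`, and the placeholder string passed as `data` are all hypothetical and only mirror the shape of the real objects.

```r
# Hypothetical parameter list mirroring the shape of mlParams
mlParamsDemo <- list(
  knn    = list(method = "knn", metric = "RMSE"),
  glmnet = list(method = "glmnet", tuneLength = 50, metric = "RMSE")
)

# Build the argument list that would be handed to do.call(), without training;
# the data argument is a placeholder string, not a real data frame
buildArgs <- function(x) {
  c(list(form = target ~ ., data = "trainDataFrame placeholder"), x)
}
argLists <- lapply(mlParamsDemo, buildArgs)

# Each element holds the shared arguments followed by the model-specific ones
print(names(argLists$glmnet))
```

Because `c()` simply concatenates the two lists, any name that appears in both the shared arguments and a model's own parameters would be passed twice, so it is worth confirming the names are unique.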
The same mlParams is used in both cases. See below.
$knn
$knn$method
[1] "knn"
$knn$tuneGrid
k
1 1
2 2
3 3
...
20 20
$knn$metric
[1] "RMSE"
$knn$preProcess
[1] "zv" "knnImpute" "center" "scale"
$glmnet
$glmnet$method
[1] "glmnet"
$glmnet$tuneLength
[1] 50
$glmnet$metric
[1] "RMSE"
$glmnet$preProcess
[1] "zv" "knnImpute" "center" "scale"
$svmRadial
$svmRadial$method
[1] "svmRadial"
$svmRadial$tuneGrid
C sigma
1 10 1e-05
2 100 1e-05
3 1000 1e-05
...
12 1000 1e-02
$svmRadial$metric
[1] "RMSE"
$svmRadial$preProcess
[1] "zv" "knnImpute" "center" "scale"
$xgbTree
$xgbTree$method
[1] "xgbTree"
$xgbTree$tuneGrid
nrounds max_depth eta gamma colsample_bytree min_child_weight
1 1 2 0.005 0 0.3 1
2 2 2 0.005 0 0.3 1
3 3 2 0.005 0 0.3 1
...
900 100 6 0.005 0 0.7 1
$xgbTree$nthread
[1] 1
$xgbTree$metric
[1] "RMSE"
$xgbTree$preProcess
[1] "zv" "knnImpute" "center" "scale"
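I don't understand why ranger$pred$pred comes out as chr instead of num for the second data set. Has anyone run into this, or does anyone know what is going on? Thanks in advance!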