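I want to apply preProcessing inside the train function so that it is carried out during resampling (as the caret guide has always recommended). I found that, of all the preProcessing methods, corr improves my predictions considerably. However, it does not work for all datasets. For example, when I apply it to datasets with exactly the same structure, it works for one of them and fails for the others. I managed to reproduce the same error with the following dataset, which differs from my own in the nature of the data but still triggers the same error: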
set.seed(12345)
y <- as.factor(sample(c("M", "F"),
                      prob = c(0.1, 0.9),
                      size = 10000,
                      replace = TRUE))
x <- rnorm(10000)
set.seed(20)
z <- rnorm(10000)
DATA <- data.frame(y, x, z)
This is how I set up trainControl and the train function:
set.seed(12345)
folds <- createFolds(DATA$y, k = 10,
                     list = TRUE, returnTrain = TRUE)
# select algorithm
algorithm <- "bayesglm"
# train parameters
set.seed(12345)
traincontrol <- trainControl(method = "loocv",
                             number = 10,
                             index = folds,
                             classProbs = TRUE,
                             summaryFunction = twoClassSummary,
                             savePredictions = TRUE
)
fitModel <- train(y ~ .,
                  data = DATA,
                  trControl = traincontrol,
                  method = algorithm,
                  metric = "ROC",
                  preProcess = c("corr")
)
This is the error I get:
Error in method$fit(x = x, y = y, wts = wts, param = tuneValue, lev = obsLevels, :
argument is missing, with no default
In addition: There were 11 warnings (use warnings() to see them)
Timing stopped at: 0.034 0.001 0.034
I can't find the problem or fix the error. What is wrong with my code?