I am trying to fit a k-nearest neighbors model using the train() function from the caret package, but it gives me an error. My code is:
"%+%" <- function(x,y) paste(x, y, sep = "")
set.seed(28)
ContEnt <- trainControl(method = "repeatedcv", number = 10, repeats = 3)
EducKnn <- train(as.formula("pp04b_cod ~ " %+% paste(VarEduc[!VarEduc %in% NoRel], collapse
= " + ")), EducPrueba, method = "knn", trctrl = ContEnt,
tuneLength = 10)
This returns:
Warning: predictions failed for Resample01: k= 5 Error in knn3Train(train = structure(c(0.569069692629571, 0.569069692629571, :
unused argument (trctrl = list("cv", 10, NA, "grid", 0.75, NULL, 1, TRUE, 0, FALSE, TRUE, "final", FALSE, FALSE, function (data, lev = NULL, model = NULL)
{
if (is.character(data$obs)) data$obs <- factor(data$obs, levels = lev)
postResample(data[, "pred"], data[, "obs"])
}, "best", list(0.95, 3, 5, 19, 10, 0.9), NULL, NULL, NULL, NULL, 0, c(FALSE, FALSE), NA, list(5, 0.05, "gls", TRUE), FALSE, TRUE))
Many similar warning messages follow, and finally:
Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info =
trainInfo, :
There were missing values in resampled performance measures.
Something is wrong; all the Accuracy metric values are missing:
Accuracy Kappa
Min. : NA Min. : NA
1st Qu.: NA 1st Qu.: NA
Median : NA Median : NA
Mean :NaN Mean :NaN
3rd Qu.: NA 3rd Qu.: NA
Max. : NA Max. : NA
NA's :10 NA's :10
Error: Stopping
>
Setting the control parameters to skip cross-validation does not seem to solve the problem:
ContEnt <- trainControl(method = "none")
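Setting na.action = na.omit in train() also gives the same result. Interestingly, the knn() function from the class package works perfectly on the same set of variables with a 0.75 training split: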
Entre <- createDataPartition(EducPrueba$pp04b_cod, 1, 0.75, list = FALSE)
EducKnn <- knn(train = EducPrueba[Entre, VarEduc[!VarEduc %in% NoRel]], test = EducPrueba[-Entre,
VarEduc[!VarEduc %in% NoRel]], cl = EducPrueba$pp04b_cod[Entre], k = 5)
Finally, to be sure, EducPrueba contains no NA or NaN values:
> any(is.na(EducPrueba))
[1] FALSE
> any(unlist(lapply(EducPrueba, is.nan)))
[1] FALSE
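The variables in VarEduc are already centered and scaled. Does anyone know how to make this work? I am using R Portable 3.5.2 in RStudio, with caret 6.0-81 and class 7.3-15. I don't know how to upload the data frame, so I cannot make this a fully reproducible example.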
Answer 0: (score: 0)
Here is how to reproduce the error:
train(Species~.,data=iris,trctrl=trainControl(method="cv",number=5),
metric="Accuracy",method="knn")
Something is wrong; all the Accuracy metric values are missing:
Accuracy Kappa
Min. : NA Min. : NA
1st Qu.: NA 1st Qu.: NA
Median : NA Median : NA
Mean :NaN Mean :NaN
3rd Qu.: NA 3rd Qu.: NA
Max. : NA Max. : NA
NA's :3 NA's :3
>
Error: Stopping
In addition: There were 50 or more warnings (use warnings() to see the first 50)
Here is the same model with the suggested change applied:
train(Species~.,data=iris,trControl=trainControl(method="cv",number=5),
metric="Accuracy",method="knn")
k-Nearest Neighbors
150 samples
4 predictor
3 classes: 'setosa', 'versicolor', 'virginica'
No pre-processing
Resampling: Cross-Validated (5 fold)
Summary of sample sizes: 120, 120, 120, 120, 120
Resampling results across tuning parameters:
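Solution to your problem
You need to change trctrl to trControl. Because trctrl is not an argument of train(), it is passed on to the underlying knn function, which rejects it as an unused argument, as the warning in your output shows.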
ContEnt <- trainControl(method = "repeatedcv", number = 10, repeats = 3)
EducKnn <- train(as.formula("pp04b_cod ~ " %+% paste(VarEduc[!VarEduc %in% NoRel], collapse
= " + ")), EducPrueba, method = "knn", trControl = ContEnt,
tuneLength = 10)
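To see why a misspelled argument name produces an "unused argument" error deep inside the model code rather than at the top-level call, here is a minimal base R sketch. The functions outer_train() and inner_fit() are hypothetical stand-ins and not caret's actual internals; the only assumption is that the wrapper forwards unmatched arguments through `...`, which is consistent with the knn3Train message in the question.

```r
# Hypothetical stand-ins for illustration only; these are not caret functions.

# inner_fit() plays the role of the underlying model function. It has no
# `...`, so any argument it does not recognise is an error.
inner_fit <- function(x, k = 5) {
  paste("fitted with k =", k)
}

# outer_train() plays the role of the wrapper. It has a named `trControl`
# argument and forwards everything it does not recognise through `...`.
outer_train <- function(x, trControl = "default control", ...) {
  list(control = trControl, fit = inner_fit(x, ...))
}

# Correct spelling: matched by outer_train() and never forwarded.
ok <- outer_train(1:10, trControl = "repeatedcv")

# Misspelled name: it matches nothing in outer_train(), lands in `...`,
# is forwarded, and only fails once inner_fit() is called.
bad <- tryCatch(
  outer_train(1:10, trctrl = "repeatedcv"),
  error = function(e) conditionMessage(e)
)

print(ok$control)
print(bad)
```

In this sketch the error surfaces immediately. In the question's output the failure instead shows up as one warning per resample followed by all-NA metrics, which is why the real cause is easy to miss.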