I am trying to apply stacking to my dataset, but I am stuck at this point.
# Load library
library(DJL)
library(caret)
library(caretEnsemble)
# Load data and rename the class levels to syntactically valid R names (needed when classProbs = TRUE)
df <- dataset.engine.2015[, -c(1, 2)]
levels(df$Type) <- list(NA.D = "NA-D", NA.P = "NA-P", SC.P = "SC-P", TC.D = "TC-D", TC.P = "TC-P")
# Run
st.methods <- c("lda", "rpart", "glm", "knn", "svmRadial")
st.control <- trainControl(method = "repeatedcv", number = 5, repeats = 3,
                           savePredictions = TRUE, classProbs = TRUE)
st.models <- caretList(Type ~., data = df, trControl = st.control, methodList = st.methods)
Then I get:
Something is wrong; all the Accuracy metric values are missing:
Accuracy Kappa
Min. : NA Min. : NA
1st Qu.: NA 1st Qu.: NA
Median : NA Median : NA
Mean :NaN Mean :NaN
3rd Qu.: NA 3rd Qu.: NA
Max. : NA Max. : NA
NA's :1 NA's :1
Error: Stopping
In addition: There were 18 warnings (use warnings() to see them)
Can anyone help me fix this error?
Answer 0 (score: 1)
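The glm method cannot be used to predict a categorical dependent variable with more than two classes; for a factor outcome it fits a binomial (two-class) model. Try removing glm from st.methods, or replace it with multinom, gbm, or randomForest (caret method name rf).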
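A minimal sketch of that swap in base R, in case it helps. The helper name pick_methods is hypothetical (it is not part of caret or caretEnsemble), and the small factor y only stands in for df$Type:

```r
# Hypothetical helper: replace "glm" when the outcome has more than two classes.
pick_methods <- function(y, methods, replacement = "multinom") {
  y <- as.factor(y)
  if (nlevels(y) > 2 && "glm" %in% methods) {
    # Swap glm for the replacement and drop any duplicate this creates
    methods <- unique(replace(methods, methods == "glm", replacement))
  }
  methods
}

# Stand-in for df$Type: a factor with the five class labels from the question
y <- factor(c("NA.D", "NA.P", "SC.P", "TC.D", "TC.P"))

st.methods <- c("lda", "rpart", "glm", "knn", "svmRadial")
st.methods <- pick_methods(y, st.methods)
print(st.methods)
```

The resulting vector can then be passed as methodList to caretList in place of the original one.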
Here are two useful experiments. First, we consider only glm:
rm(list=ls())
library(DJL)
library(caret)
library(caretEnsemble)
df <- dataset.engine.2015[, -c(1, 2)]
levels(df$Type) <- list(NA.D = "NA-D", NA.P = "NA-P", SC.P = "SC-P", TC.D = "TC-D", TC.P = "TC-P")
st.control <- trainControl(method = "repeatedcv", number = 5, repeats = 3,
                           savePredictions = TRUE, classProbs = TRUE)
st.methods <- c("glm")
st.models <- caretList(Type ~., data = df, trControl = st.control, methodList = st.methods)
Here is the error message:
Something is wrong; all the Accuracy metric values are missing:
Accuracy Kappa
Min. : NA Min. : NA
1st Qu.: NA 1st Qu.: NA
Median : NA Median : NA
Mean :NaN Mean :NaN
3rd Qu.: NA 3rd Qu.: NA
Max. : NA Max. : NA
NA's :1 NA's :1
Error in train.default(x, y, weights = w, ...) : Stopping
In addition: There were 18 warnings (use warnings() to see them)
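One way to see why a binomial glm cannot handle this target, sketched with base R only. It uses the built-in iris data as a stand-in (an assumption; it is not the engine dataset), whose Species factor has three classes. With family = binomial, a factor response is reduced to "first level" versus "everything else", so the fit never distinguishes the remaining classes:

```r
# Binomial glm on a three-class factor: R treats the first level as "failure"
# and all other levels as "success".
fit_factor <- glm(Species ~ Sepal.Width, data = iris, family = binomial)

# The same model, with the collapse to two classes written out explicitly
not_first <- as.integer(iris$Species != levels(iris$Species)[1])
fit_binary <- glm(not_first ~ Sepal.Width, data = iris, family = binomial)

# Both fits give the same coefficients, i.e. only two classes were modelled
print(all.equal(coef(fit_factor), coef(fit_binary)))
```

This illustrates the limitation in plain glm; the exact warnings raised inside caret may differ, and warnings() will show them.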
Now we replace glm with multinom:
st.methods <- c("multinom")
st.models <- caretList(Type ~., data = df, trControl = st.control, methodList = st.methods)
print(st.models)
The output is:
$multinom
Penalized Multinomial Regression
1206 samples
5 predictor
5 classes: 'NA.D', 'NA.P', 'SC.P', 'TC.D', 'TC.P'
No pre-processing
Resampling: Cross-Validated (5 fold, repeated 3 times)
Summary of sample sizes: 964, 965, 965, 965, 965, 964, ...
Resampling results across tuning parameters:
decay Accuracy Kappa
0e+00 0.9306411 0.8518294
1e-04 0.9300901 0.8506964
1e-01 0.9328507 0.8564466
Accuracy was used to select the optimal model using the largest value.
The final value used for the model was decay = 0.1.
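The selection rule in the last two lines of the output can be reproduced by hand. This is only a sketch that copies the three resampling rows above into a data frame; it does not call caret:

```r
# Resampling results copied from the multinom output above
results <- data.frame(
  decay    = c(0, 1e-04, 1e-01),
  Accuracy = c(0.9306411, 0.9300901, 0.9328507),
  Kappa    = c(0.8518294, 0.8506964, 0.8564466)
)

# Pick the row with the largest Accuracy, as the output says caret did
best <- results[which.max(results$Accuracy), ]
print(best)
```

The selected row is the one with decay = 0.1, matching the final value reported in the output.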