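R caret error: Something is wrong; all the Accuracy metric values are missing:

Date: 2017-05-02 08:50:36

Tags: r r-caret

I am trying to apply stacking to my dataset, but I am stuck at this step.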

# Load library
library(DJL)
library(caret)
library(caretEnsemble)

# Load data and rename the target levels to syntactically valid R names
df <- dataset.engine.2015[, -c(1, 2)]
levels(df$Type) <- list(NA.D = "NA-D", NA.P = "NA-P", SC.P = "SC-P", TC.D = "TC-D", TC.P = "TC-P")

# Run
st.methods <- c("lda", "rpart", "glm", "knn", "svmRadial")
st.control <- trainControl(method = "repeatedcv", number = 5, repeats = 3, 
                           savePredictions = TRUE, classProbs = TRUE)
st.models  <- caretList(Type ~., data = df, trControl = st.control, methodList = st.methods)

Then I get:

Something is wrong; all the Accuracy metric values are missing:
    Accuracy       Kappa    
 Min.   : NA   Min.   : NA  
 1st Qu.: NA   1st Qu.: NA  
 Median : NA   Median : NA  
 Mean   :NaN   Mean   :NaN  
 3rd Qu.: NA   3rd Qu.: NA  
 Max.   : NA   Max.   : NA  
 NA's   :1     NA's   :1    
Error: Stopping
In addition: There were 18 warnings (use warnings() to see them)

Can anyone help me fix this error?

1 answer:

Answer 0 (score: 1)

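A glm model cannot be used to predict a categorical dependent variable with more than two classes. Try removing glm from st.methods, or replace it with multinom, gbm, or randomForest.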
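The suggestion above can be sketched in base R. This is only an illustration: `iris$Species` is used as a stand-in for `df$Type` (an assumption, chosen because it is a built-in factor with more than two levels), and the names `dropped` and `swapped` are hypothetical.

```r
# Stand-in for df$Type: a built-in factor with 3 levels (assumption for illustration)
target <- iris$Species

st.methods <- c("lda", "rpart", "glm", "knn", "svmRadial")

if (nlevels(target) > 2) {
  # Option 1: drop glm from the method list
  dropped <- setdiff(st.methods, "glm")
  # Option 2: keep the position in the list but swap glm for multinom
  swapped <- replace(st.methods, st.methods == "glm", "multinom")
}

print(dropped)
print(swapped)
```

Either vector could then be passed as `methodList` in the `caretList` call from the question.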

Here are two useful experiments. First, we consider glm only:

rm(list=ls())
library(DJL)
library(caret)
library(caretEnsemble)  
df <- dataset.engine.2015[, -c(1, 2)]
levels(df$Type) <- list(NA.D = "NA-D", NA.P = "NA-P", SC.P = "SC-P", TC.D = "TC-D", TC.P = "TC-P")

st.control <- trainControl(method = "repeatedcv", number = 5, repeats = 3, 
                           savePredictions = TRUE, classProbs = TRUE)

st.methods <- c("glm")
st.models  <- caretList(Type ~., data = df, trControl = st.control, methodList = st.methods)

This is the error message:

Something is wrong; all the Accuracy metric values are missing:
    Accuracy       Kappa    
 Min.   : NA   Min.   : NA  
 1st Qu.: NA   1st Qu.: NA  
 Median : NA   Median : NA  
 Mean   :NaN   Mean   :NaN  
 3rd Qu.: NA   3rd Qu.: NA  
 Max.   : NA   Max.   : NA  
 NA's   :1     NA's   :1    
Error in train.default(x, y, weights = w, ...) : Stopping
In addition: There were 18 warnings (use warnings() to see them)
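The last line of that message points to `warnings()`, which is where the reason for each failed fit is likely to be reported. As a minimal base-R sketch of collecting warnings while a call runs, the snippet below uses a hypothetical stand-in function, `fit_with_warnings`, instead of the real `caretList` call; the warning text it emits is invented for illustration.

```r
# Hypothetical stand-in for a training call that warns once per failed resample
fit_with_warnings <- function() {
  for (i in 1:3) {
    warning(sprintf("model fit failed for resample %d", i))
  }
  NULL
}

collected <- character(0)

result <- withCallingHandlers(
  fit_with_warnings(),
  warning = function(w) {
    # Store the message, then let execution continue past the warning
    collected <<- c(collected, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

print(collected)
```

In an interactive session, simply calling `warnings()` right after the failed run, as the message suggests, should be enough.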

Now we replace glm with multinom:

st.methods <- c("multinom")
st.models  <- caretList(Type ~., data = df, trControl = st.control, methodList = st.methods)
print(st.models)

The output is:

$multinom
Penalized Multinomial Regression 

1206 samples
   5 predictor
   5 classes: 'NA.D', 'NA.P', 'SC.P', 'TC.D', 'TC.P' 

No pre-processing
Resampling: Cross-Validated (5 fold, repeated 3 times) 
Summary of sample sizes: 964, 965, 965, 965, 965, 964, ... 
Resampling results across tuning parameters:

  decay  Accuracy   Kappa    
  0e+00  0.9306411  0.8518294
  1e-04  0.9300901  0.8506964
  1e-01  0.9328507  0.8564466

Accuracy was used to select the optimal model using  the largest value.
The final value used for the model was decay = 0.1.
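The last two lines of the output state the selection rule: the decay value with the largest resampled Accuracy is kept. As a small check, the same rule can be applied by hand in base R; the data frame below just retypes the three rows of the table above, and the names `results` and `best` are hypothetical.

```r
# Resampling results copied from the printed table above
results <- data.frame(
  decay    = c(0e+00, 1e-04, 1e-01),
  Accuracy = c(0.9306411, 0.9300901, 0.9328507),
  Kappa    = c(0.8518294, 0.8506964, 0.8564466)
)

# Pick the row with the largest Accuracy, mirroring the rule in the output
best <- results[which.max(results$Accuracy), ]
print(best)
```

The selected row is the one with decay = 0.1, which matches the final value reported by caret.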