No fitted values in the final model when using caret's train

Time: 2016-01-19 10:22:57

Tags: r decision-tree r-caret

Here is the code:

    ctrl <- trainControl(method="cv",number = 5, summaryFunction=twoClassSummary, classProbs=T, savePredictions = T, verboseIter = T)
    grid=expand.grid(.trials=c(1,100),.model=c("tree","rules"),.winnow=c(T,F))
    m=train(Category1 ~ ., data = tr.bal,method="C5.0", metric="ROC",trControl=ctrl, tuneGrid=grid)

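I expected to find a fitted values column under m$finalModel, but I don't see one. Am I missing something? How do I get the predicted values from the final model? I want to compute the ROC afterwards.

The sample data is as follows: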

    structure(list(production_year = c(2009L, 2011L, 2011L, 2010L,  2011L, 2010L), movie_sequel = structure(c(1L, 2L, 2L, 2L, 2L,  1L), .Label = c("0", "1"), class = "factor"), creative_type = structure(c(2L,  2L, 2L, 2L, 2L, 2L), .Label = c("other", "mainstream"), class = "factor"),
    source = structure(c(3L, 1L, 1L, 3L, 1L, 1L), .Label = c("based", 
    "other", "Original Screenplay"), class = "factor"), production_method = structure(c(1L, 
    1L, 1L, 1L, 2L, 1L), .Label = c("other", "Live Action"), class = "factor"), 
    genre = structure(c(1L, 2L, 1L, 2L, 2L, 2L), .Label = c("Action", 
    "Adventure", "other", "Comedy", "Drama", "Romantic Comedy", 
    "Thriller/Suspense"), class = "factor"), language = structure(c(2L, 
    2L, 2L, 2L, 2L, 2L), .Label = c("other", "English"), class = "factor"), 
    movie_board_rating_display_name = structure(c(3L, 3L, 3L, 
    1L, 3L, 2L), .Label = c("other", "PG", "PG-13", "R"), class = "factor"), 
    movie_release_pattern_display_name = structure(c(7L, 7L, 
    7L, 7L, 7L, 7L), .Label = c("Exclusive", "Expands Wide", 
    "IMAX", "Limited", "Oscar Qualifying Run", "Special Engagement", 
    "Wide"), class = "factor"), Category1 = structure(c(2L, 2L, 
    2L, 2L, 2L, 2L), .Label = c("nothit", "hit"), class = "factor")), .Names = c("production_year",  "movie_sequel", "creative_type", "source", "production_method",  "genre", "language", "movie_board_rating_display_name", "movie_release_pattern_display_name",  "Category1"), row.names = c(NA, 6L), class = "data.frame")

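1 answer:

Answer 0 (score: 0)

You can get the fitted values by calling predict on the model of class train returned by the caret package. You can then use pROC::roc to create the ROC curve.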

    p = predict(m)
    curve = pROC::roc(tr.bal$Category1, as.numeric(p))
    plot(curve)

Or as a reproducible example:

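    library(caret)
    library(pROC)
    data(mtcars)

    # With classProbs = TRUE, caret needs class levels that are valid R names,
    # so recode am (0/1) as a labelled factor instead of using factor(am)
    mtcars$am <- factor(mtcars$am, levels = c(0, 1),
                        labels = c("automatic", "manual"))

    ctrl <- trainControl(method = "cv", number = 5,
                         summaryFunction = twoClassSummary, classProbs = TRUE,
                         savePredictions = TRUE, verboseIter = TRUE)

    grid <- expand.grid(trials = c(1, 100),
                        model = c("tree", "rules"), winnow = c(TRUE, FALSE))
    m <- train(am ~ ., data = mtcars,
               method = "C5.0", metric = "ROC",
               trControl = ctrl, tuneGrid = grid)

    # Fitted classes for the training data
    p <- predict(m)

    # roc() wants a numeric predictor, so convert the factor to its integer codes
    curve <- roc(response = mtcars$am, predictor = as.numeric(p))
    plot(curve)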

Annoyingly, the roc function needs a numeric vector rather than a factor, hence the as.numeric.
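
A possible refinement, which goes beyond the answer above: because the model is trained with classProbs = TRUE and savePredictions = TRUE, the ROC curve can also be built from class probabilities instead of integer-coded class labels, which gives the curve more than one threshold. The sketch below assumes the mtcars data with am recoded to the illustrative labels "automatic" and "manual"; the names df, probs, held_out, roc_train and roc_cv are made up for this example. It also assumes that the held-out predictions in m$pred can be matched to m$bestTune by merging on the tuning-parameter columns.

```r
library(caret)
library(pROC)

set.seed(1)  # arbitrary seed so the CV folds are repeatable

# Illustrative data: mtcars with am recoded to valid R names,
# which classProbs = TRUE requires
df <- mtcars
df$am <- factor(df$am, levels = c(0, 1), labels = c("automatic", "manual"))

ctrl <- trainControl(method = "cv", number = 5,
                     summaryFunction = twoClassSummary,
                     classProbs = TRUE, savePredictions = TRUE)

m <- train(am ~ ., data = df, method = "C5.0",
           metric = "ROC", trControl = ctrl,
           tuneGrid = expand.grid(trials = c(1, 100),
                                  model = c("tree", "rules"),
                                  winnow = c(TRUE, FALSE)))

# In-sample class probabilities from the final model
probs <- predict(m, newdata = df, type = "prob")
roc_train <- roc(response = df$am, predictor = probs[["manual"]],
                 levels = c("automatic", "manual"), direction = "<")

# Held-out predictions kept by savePredictions, restricted to the
# tuning parameters that train() selected
held_out <- merge(m$pred, m$bestTune)
roc_cv <- roc(response = held_out$obs, predictor = held_out$manual,
              levels = c("automatic", "manual"), direction = "<")

auc(roc_train)  # in-sample, usually optimistic
auc(roc_cv)     # cross-validated
```

The in-sample curve is scored on the same rows the model was fitted to, so the cross-validated curve from m$pred is usually the more honest estimate of the two.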