Here is the code:
ctrl <- trainControl(method="cv",number = 5, summaryFunction=twoClassSummary, classProbs=T, savePredictions = T, verboseIter = T)
grid=expand.grid(.trials=c(1,100),.model=c("tree","rules"),.winnow=c(T,F))
m=train(Category1 ~ ., data = tr.bal,method="C5.0", metric="ROC",trControl=ctrl, tuneGrid=grid)
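I expected m$finalModel to contain a column of fitted values, but I don't see one, or perhaps I'm missing something. How do I get the predicted values from the final model? I want to compute the ROC afterwards.
Sample data is below: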
structure(list(production_year = c(2009L, 2011L, 2011L, 2010L, 2011L, 2010L), movie_sequel = structure(c(1L, 2L, 2L, 2L, 2L, 1L), .Label = c("0", "1"), class = "factor"), creative_type = structure(c(2L, 2L, 2L, 2L, 2L, 2L), .Label = c("other", "mainstream"), class = "factor"),
source = structure(c(3L, 1L, 1L, 3L, 1L, 1L), .Label = c("based",
"other", "Original Screenplay"), class = "factor"), production_method = structure(c(1L,
1L, 1L, 1L, 2L, 1L), .Label = c("other", "Live Action"), class = "factor"),
genre = structure(c(1L, 2L, 1L, 2L, 2L, 2L), .Label = c("Action",
"Adventure", "other", "Comedy", "Drama", "Romantic Comedy",
"Thriller/Suspense"), class = "factor"), language = structure(c(2L,
2L, 2L, 2L, 2L, 2L), .Label = c("other", "English"), class = "factor"),
movie_board_rating_display_name = structure(c(3L, 3L, 3L,
1L, 3L, 2L), .Label = c("other", "PG", "PG-13", "R"), class = "factor"),
movie_release_pattern_display_name = structure(c(7L, 7L,
7L, 7L, 7L, 7L), .Label = c("Exclusive", "Expands Wide",
"IMAX", "Limited", "Oscar Qualifying Run", "Special Engagement",
"Wide"), class = "factor"), Category1 = structure(c(2L, 2L,
2L, 2L, 2L, 2L), .Label = c("nothit", "hit"), class = "factor")), .Names = c("production_year", "movie_sequel", "creative_type", "source", "production_method", "genre", "language", "movie_board_rating_display_name", "movie_release_pattern_display_name", "Category1"), row.names = c(NA, 6L), class = "data.frame")
Answer 0 (score: 0)
You can use the predict function to get fitted values from a model of class train in the caret package. You can then use pROC::roc to create the ROC curve.
p <- predict(m)  # predicted classes for the training data
curve <- pROC::roc(tr.bal$Category1, as.numeric(p))  # factor converted to integer codes
plot(curve)
Or as a reproducible example:
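library(caret)
library(pROC)
data(mtcars)
# classProbs = TRUE needs class levels that are valid R names, so label am
# up front; this also keeps am out of the predictors matched by "."
mtcars$am <- factor(mtcars$am, labels = c("automatic", "manual"))
ctrl <- trainControl(method = "cv", number = 5,
                     summaryFunction = twoClassSummary, classProbs = TRUE,
                     savePredictions = TRUE, verboseIter = TRUE)
grid <- expand.grid(trials = c(1, 100),
                    model = c("tree", "rules"), winnow = c(TRUE, FALSE))
m <- train(am ~ ., data = mtcars,
           method = "C5.0", metric = "ROC",
           trControl = ctrl, tuneGrid = grid)
predict(m)  # fitted classes for the training data
curve <- roc(response = mtcars$am,
             predictor = as.numeric(predict(m)))
plot(curve)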
Annoyingly, the roc function expects a numeric (or ordered) predictor rather than a plain factor, hence the as.numeric.
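Hard class labels give roc only a single cut-off, so a curve built from class probabilities is usually more informative. The following is a minimal sketch, not the answer's original code, assuming the caret, C50, plyr and pROC packages are installed. The labels "automatic" and "manual", the object names dat, prob_train and cv_pred, and the smaller tuning grid are illustrative choices. It also reads the held-out predictions that caret keeps in m$pred when savePredictions is set, which are less optimistic than predictions on the training data.

```r
library(caret)  # train() loads C50 on demand; pROC is called via ::

set.seed(1)  # make the cross-validation folds reproducible

# Assumed labels: in mtcars, am = 0 is automatic and am = 1 is manual.
dat <- mtcars
dat$am <- factor(dat$am, levels = c(0, 1), labels = c("automatic", "manual"))

ctrl <- trainControl(method = "cv", number = 5,
                     summaryFunction = twoClassSummary,
                     classProbs = TRUE,
                     savePredictions = "final")  # keep hold-outs for the best tune only

# A deliberately small grid so the sketch runs quickly.
grid <- expand.grid(trials = c(1, 10), model = "tree", winnow = FALSE)

m <- train(am ~ ., data = dat, method = "C5.0", metric = "ROC",
           trControl = ctrl, tuneGrid = grid)

# 1. Class probabilities from the final model on the training data.
#    These are in-sample, so the resulting ROC tends to be optimistic.
prob_train <- predict(m, newdata = dat, type = "prob")
roc_train <- pROC::roc(response = dat$am, predictor = prob_train$manual,
                       levels = c("automatic", "manual"), direction = "<")

# 2. Held-out probabilities collected during cross-validation.
cv_pred <- m$pred
roc_cv <- pROC::roc(response = cv_pred$obs, predictor = cv_pred$manual,
                    levels = c("automatic", "manual"), direction = "<")

pROC::auc(roc_train)
pROC::auc(roc_cv)
```

Either object can be passed to plot() in the same way as curve above. Setting levels and direction explicitly avoids relying on pROC's automatic choice of which class counts as the case.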