Throughout my function I have two arguments, z and y. I want z to be the data set (e.g. birthwt) and y to be the response variable (e.g. birthwt$low).
library("MASS")   # provides the birthwt data set
library("dplyr")  # provides filter()
data(birthwt)

foo <- function(z, y) {
  n.folds <- 10
  # Randomly assign each row of z to one of n.folds folds
  folds <- cut(sample(seq_len(nrow(z))), breaks = n.folds, labels = FALSE)
  all.confusion.tables <- list()
  for (i in seq_len(n.folds)) {
    # Hold out fold i for testing and fit on the remaining folds
    train <- filter(z, folds != i)
    test <- filter(z, folds == i)
    glm.train <- glm(y ~ ., family = binomial, data = train)
    mod_pred_probs <- predict(glm.train, test, type = "response")
    pred.class <- ifelse(mod_pred_probs < 0, 0, 1)
    all.confusion.tables[[i]] <- table(pred = pred.class, true = test$y)
  }
  # Misclassification rate: share of off-diagonal counts in a confusion table
  misclassrisk <- function(x) { (sum(x) - sum(diag(x))) / sum(x) }
  risk <- sapply(all.confusion.tables, misclassrisk)
  return(table(risk))
  mean(risk)
}
When I run foo(birthwt, "low"), I get this error:
Error in model.frame.default(formula = y ~ ., data = train, drop.unused.levels = TRUE) :
variable lengths differ (found for 'low')
Does anyone know why I get this error, or how I can avoid it?
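One possible explanation, offered as a sketch rather than a confirmed diagnosis: train has no column named y, so the y in y ~ . resolves to the function argument, whose length does not match the number of rows in the training fold. The . then expands to every column of train, including low, which is the first variable whose length disagrees. The variant below takes the response as a column name and builds the formula from it with reformulate(). The names cv_misclass, response and threshold are hypothetical, the 0.5 cutoff is an assumption that differs from the 0 used above, and base indexing is swapped in for filter().

```r
library(MASS)  # provides the birthwt data set

# Hypothetical variant of foo(): `response` is the NAME of the response
# column (a string), so the formula is resolved inside `train`.
cv_misclass <- function(data, response, n.folds = 10, threshold = 0.5) {
  # Randomly assign each row to one of n.folds folds
  folds <- cut(sample(seq_len(nrow(data))), breaks = n.folds, labels = FALSE)
  # For response = "low" this builds the formula low ~ .
  form <- reformulate(".", response = response)
  risk <- numeric(n.folds)
  for (i in seq_len(n.folds)) {
    train <- data[folds != i, , drop = FALSE]
    test  <- data[folds == i, , drop = FALSE]
    fit   <- glm(form, family = binomial, data = train)
    probs <- predict(fit, newdata = test, type = "response")
    # Assumed cutoff: classify as 1 when the predicted probability >= threshold
    pred.class <- ifelse(probs < threshold, 0, 1)
    # Share of held-out rows whose predicted class differs from the truth
    risk[i] <- mean(pred.class != test[[response]])
  }
  mean(risk)  # average misclassification rate across folds
}

set.seed(1)
cv_misclass(birthwt, "low")
```

With birthwt, glm() may warn about fitted probabilities of 0 or 1, because the bwt column determines low; dropping bwt from the data before calling the function would be one way to avoid that.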