The for loop runs, but does not work as expected

Time: 2019-11-04 03:15:15

Tags: r

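I am working on an ML project in R. I have prepared the dataset and split it into 10 equal parts, but the problem is that I would otherwise have to fit the model manually 10 times (10-fold CV). I tried to use a for loop to create the training and test data, but on every iteration the training set is the entire dataset and the test set is NULL. Can anyone help me?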

# Preparing the data

data <- read.csv("./project.csv")

id <- seq_len(103342)

data[, 'id'] <- id

for (i in 3:8) {
  data[,i] <- as.factor(data[,i])
}



# splitting the data into 10 equal data frames

f <- rep(seq(1, 10), each=round(103342/10), length.out=103342)

df <- split(data, f)

lapply(df, dim)


# running 10-fold cross-validation and computing error rate and AUC for each run.

results <- matrix(nrow=10, ncol=2, dimnames= list(c(), c('error_rate', 'auc')))

for (i in 1:10) {
  train <- data[!(data$id %in% df$`i`$id),]
  test <- df$`i`
  print(dim(test)) # This is my problem: the print statement prints NULL 10 times
  glm.fit <- glm(canceled ~ ., data=train, family=binomial)
  glm.prob <- predict(glm.fit, newdata=test, type="response")
  ...
}
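
A likely cause of the symptom described above: df$`i` asks the list for an element literally named "i", not for the element at the position stored in the variable i. split() names the folds "1" to "10", so that lookup returns NULL. As a result test is NULL, and data$id %in% NULL is FALSE for every row, so train ends up as the whole dataset. Indexing with df[[i]] uses the value of i instead. The sketch below reproduces this on synthetic data, because project.csv is not available here. The columns x1 and x2, the 0.5 classification threshold, and dropping id from the model formula are assumptions for illustration and are not taken from the question; the AUC column is left unfilled because that part of the original loop is elided.

```r
# Synthetic stand-in for project.csv (x1 and x2 are hypothetical predictors;
# 'canceled' is the response name used in the question).
set.seed(1)
n <- 1000
data <- data.frame(
  canceled = rbinom(n, 1, 0.3),
  x1 = rnorm(n),
  x2 = factor(sample(c("a", "b", "c"), n, replace = TRUE))
)
data$id <- seq_len(n)

# Block-wise fold labels as in the question, sized from the data.
f <- rep(1:10, each = ceiling(n / 10), length.out = n)
df <- split(data, f)

# The original lookup: there is no list element named "i", so this is NULL.
i <- 1
print(is.null(df$`i`))
# Positional lookup with [[ ]] uses the value of i and returns a data frame.
print(dim(df[[i]]))

results <- matrix(nrow = 10, ncol = 2,
                  dimnames = list(NULL, c("error_rate", "auc")))

for (i in seq_along(df)) {
  test  <- df[[i]]                             # i-th fold as the test set
  train <- data[!(data$id %in% test$id), ]     # all remaining rows
  # Assumption: id is a row label, so it is excluded from the predictors.
  glm.fit  <- glm(canceled ~ . - id, data = train, family = binomial)
  glm.prob <- predict(glm.fit, newdata = test, type = "response")
  # Assumption: classify as 1 when the predicted probability exceeds 0.5.
  glm.pred <- as.integer(glm.prob > 0.5)
  results[i, "error_rate"] <- mean(glm.pred != test$canceled)
  # The "auc" column is left as NA; the question does not show how it is computed.
}

print(results[, "error_rate"])
```

In the original loop, the equivalent change would be replacing both occurrences of df$`i` with df[[i]]. Since split() names the folds by their labels, df[[as.character(i)]] should also work.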

0 answers:

No answers yet