Manual LOOCV vs. cv.glm

Asked: 2017-08-08 11:28:47

Tags: r logistic-regression cross-validation

In An Introduction to Statistical Learning, we are asked to perform Leave-One-Out Cross-Validation (LOOCV) manually for a logistic regression. Here is the code:

count = rep(0, nrow(Weekly))
for (i in 1:nrow(Weekly)) {
    ## fit a logistic regression model, leaving the ith observation out of the training data
    glm.fit = glm(Direction ~ Lag1 + Lag2, data = Weekly[-i, ], family = binomial)

    ## predict the held-out observation; classify as "Up" when P > 0.5
    is_up = predict.glm(glm.fit, Weekly[i, ], type = "response") > 0.5

    is_true_up = Weekly[i, ]$Direction == "Up"
    if (is_up != is_true_up)
        count[i] = 1
}
sum(count)
##[1] 490

The source of this code can be found here.

This implies an error rate of about 45%. But when we do the same thing with the cv.glm() function from the boot library, the result is very different:

> library(boot)
> glm.fit = glm(Direction~Lag1+Lag2,data=Weekly,family=binomial)
> cv.err = cv.glm(Weekly, glm.fit)
> cv.err$delta
[1] 0.2464536 0.2464530
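A sketch that may account for the gap, based on the documented interface of boot::cv.glm: its cost argument defaults to the average squared error between the 0/1 response and the fitted probability, not to the misclassification rate the manual loop computes. Supplying a 0.5-threshold cost function (the example cost from the cv.glm help page) should put the two on the same scale. This assumes the Weekly data set comes from the ISLR package.

```r
library(boot)
library(ISLR)  # assumed source of the Weekly data set

## 0/1 misclassification cost: r is the observed 0/1 response,
## pi the predicted probability; an error is counted when they
## fall on opposite sides of 0.5
cost <- function(r, pi = 0) mean(abs(r - pi) > 0.5)

glm.fit <- glm(Direction ~ Lag1 + Lag2, data = Weekly, family = binomial)

## LOOCV with the misclassification cost instead of the default
## squared-error cost
cv.err <- cv.glm(Weekly, glm.fit, cost = cost)
cv.err$delta[1]  # expected to be close to 490/1089, i.e. about 0.45
```

With the default cost, delta is instead the mean of (y - p)^2, which for well-calibrated probabilities near 0.5 sits around 0.25, consistent with the 0.246 shown above.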

Why does this happen? What exactly does the cv.glm() function do?

0 Answers:

No answers yet