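I am having trouble running 10-fold cross-validation for a logistic regression in R.
I used the cv.glm() function, but it throws an error. When I run the same function on the Smarket data from the ISLR package, no error appears. The predictors in my logistic regression are binary.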
# 10-Fold Cross-Validation for Logistic Regression
cv.errorlog7 <- cv.glm(p, logit7, K=10)$delta[1]
I get the following error message:
Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) :
factor gender has new levels Other
In addition: Warning messages:
1: In predict.lm(object, newdata, se.fit, scale = 1, type = if (type == :
prediction from a rank-deficient fit may be misleading
2: In predict.lm(object, newdata, se.fit, scale = 1, type = if (type == :
prediction from a rank-deficient fit may be misleading
3: In y - yhat :
longer object length is not a multiple of shorter object length
Answer 0 (score: 1)
I ran into a very similar error:
> set.seed(100)
> cv.lm(data = catering1, form.lm = model, m=3) # 3 fold cross-validation
Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) :
factor Month has new levels July
# Reset seed
> set.seed(1000)
> cv.lm(data = catering1, form.lm = model, m=3) # 3 fold cross-validation
Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) :
factor Month has new levels July
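As you can see, I even reset the seed and tried again. No luck. However, once I increased the number of folds (I kept adding one fold at a time until I got a response), the code ran. I still got an error and a warning, though.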
> cv.lm(data = catering, form.lm = model, m=5) # 5 fold cross-validation
Response.... Anova table....
Error in which.min(xval) :
'list' object cannot be coerced to type 'double'
In addition: Warning message:
In cv.lm(data = catering, form.lm = model, m = 5) :
As there is >1 explanatory variable, cross-validation
predicted values for a fold are not a linear function
of corresponding overall predicted values. Lines that
are shown for the different folds are approximate
So I would try increasing the number of folds. Since your dataset is relatively small, it should not affect performance much.
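A rough way to see why more folds can help: the "has new levels" error appears when every row carrying a rare factor level lands in the held-out fold, so the model fitted on the remaining rows never sees that level. The sketch below uses base R only and made-up toy data (the `toy` data frame and the helpers `count_bad_folds` and `frac_bad` are hypothetical names, not part of boot or DAAG) to estimate how often a random split runs into this for a given number of folds.

```r
# Toy data (hypothetical, not the asker's data): 'gender' has a rare level "Other"
set.seed(1)
n <- 60
toy <- data.frame(
  y = rbinom(n, 1, 0.5),
  gender = factor(c(rep("Female", 29), rep("Male", 29), rep("Other", 2)))
)

# How many observations does each level have?
print(table(toy$gender))

# For one random K-fold split, count the folds whose held-out rows
# contain a level that is absent from the corresponding training rows
count_bad_folds <- function(f, K) {
  fold <- sample(rep(seq_len(K), length.out = length(f)))
  bad <- 0
  for (k in seq_len(K)) {
    train_levels <- unique(as.character(f[fold != k]))
    test_levels  <- unique(as.character(f[fold == k]))
    if (length(setdiff(test_levels, train_levels)) > 0) bad <- bad + 1
  }
  bad
}

# Fraction of random splits that contain at least one problematic fold
frac_bad <- function(f, K, reps = 500) {
  mean(replicate(reps, count_bad_folds(f, K) > 0))
}

print(frac_bad(toy$gender, K = 3))
print(frac_bad(toy$gender, K = 10))
```

Note that if a level occurs in only one row, the fold holding that row always fails no matter how many folds you use. In that case, merging or dropping the rare level before cross-validating is probably the more reliable fix, and running `table()` on each factor in your own data is a quick way to check.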