R error: 'predictions' contains NA (logit). How can I solve this error?

Time: 2021-03-03 16:50:35

Tags: r logistic-regression prediction

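I am fitting a logistic regression, and when I predict I get an error about NA values in my data. I have tried different approaches but still get the same error. Here is my code: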

Modelo_lg <- glm(Default ~ TIPO_ID + Añomes + NOMBRE_PRO + Saldo_Corte + Provisión + 
           + Calificación + Segmentación + Calif_R, data = ME, family = "binomial")
summary(Modelo_lg)

Call:
glm(formula = Default ~ TIPO_ID + Añomes + NOMBRE_PRO + Saldo_Corte + 
Provisión + +Calificación + Segmentación + Calif_R, family = "binomial", 
data = ME, na.action = na.omit)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-2.6762  -0.0407  -0.0129  -0.0037   4.4010  

Coefficients:
                                        Estimate Std. Error z value Pr(>|z|)    
(Intercept)                           -3.352e+02  8.849e+02  -0.379    0.705    
TIPO_ID                               -1.014e-01  6.425e-02  -1.578    0.115    
Añomes                                 1.559e-03  3.295e-04   4.731 2.24e-06 ***
NOMBRE_PROGB-CORPORATIVO M/E           8.334e-01  1.870e-01   4.457 8.33e-06 ***
NOMBRE_PROGB-PRESTAMOS REDES SIN GTIA  1.947e+00  1.293e-01  15.066  < 2e-16 ***
Saldo_Corte                            6.447e-12  1.385e-11   0.465    0.642    
Provisión                             -1.478e-11  2.201e-11  -0.671    0.502    
CalificaciónB                          2.992e+00  1.753e-01  17.070  < 2e-16 ***
CalificaciónC                          6.624e+00  1.428e-01  46.395  < 2e-16 ***
CalificaciónD                          8.702e+00  1.586e-01  54.865  < 2e-16 ***
CalificaciónE                          1.003e+01  2.210e-01  45.368  < 2e-16 ***
SegmentaciónColombia_Corp             -3.160e+00  3.301e-01  -9.575  < 2e-16 ***
SegmentaciónColombia_Emp              -5.245e+00  3.562e-01 -14.723  < 2e-16 ***
SegmentaciónColombia_Miami            -1.603e+01  1.030e+03  -0.016    0.988    
SegmentaciónColombia_Pyme             -2.481e+00  3.298e-01  -7.524 5.31e-14 ***
Calif_RR10                             1.338e+01  8.824e+02   0.015    0.988    
Calif_RR2                              4.730e-01  1.012e+03   0.000    1.000    
Calif_RR3                              1.236e+01  8.824e+02   0.014    0.989    
Calif_RR4                              4.001e-01  9.229e+02   0.000    1.000    
Calif_RR5                              1.426e+01  8.824e+02   0.016    0.987    
Calif_RR6                              1.526e+01  8.824e+02   0.017    0.986    
Calif_RR7                              1.731e+01  8.824e+02   0.020    0.984    
Calif_RR8                              1.684e+01  8.824e+02   0.019    0.985    
Calif_RR9                              1.608e+01  8.824e+02   0.018    0.985    
Calif_RSin R                           1.470e+01  8.824e+02   0.017    0.987    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Null deviance: 25839.9  on 259715  degrees of freedom
Residual deviance:  5832.4  on 259691  degrees of freedom
(4 observations deleted due to missingness)
AIC: 5882.4

Number of Fisher Scoring iterations: 21

####Dividing the sample####

n<- dim(ME)[1]

set.seed(1234) # random sample
train <- sample(1:n , 0.7*n)

ME.test <- ME[-train,]
ME.train <- ME[train,]

ytrain <- ME$Default[train]
ytest <- ME$Default[-train]

###Predict

pred1<- predict.glm(Modelo_lg, newdata = ME.test, type="response")
result1<- table(ytest, floor(pred1+0.5))
result1


ytest     0     1
    0 77131    99
    1   161   524
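
The four cells of this table add up to 77,915, which may be worth comparing against `nrow(ME.test)`: by default `table()` silently drops `NA` values, so a missing prediction would not appear in it. A minimal sketch on made-up vectors, where `y_demo` and `p_demo` are hypothetical stand-ins for `ytest` and `pred1`:

```r
# Hypothetical stand-ins for ytest and pred1; the third prediction is missing.
y_demo <- c(0, 0, 1, 1, 0)
p_demo <- c(0.1, 0.2, NA, 0.9, 0.6)

# Default behaviour: the pair with the NA prediction is silently dropped.
tab_default <- table(y_demo, floor(p_demo + 0.5))
sum(tab_default)   # 4, although there are 5 observations

# useNA = "ifany" adds an explicit NA column, so the gap becomes visible.
tab_with_na <- table(y_demo, floor(p_demo + 0.5), useNA = "ifany")
sum(tab_with_na)   # 5

# Direct count of missing predictions.
n_missing <- sum(is.na(p_demo))
```

On the real data the equivalent checks would be `sum(is.na(pred1))` and `table(ytest, floor(pred1 + 0.5), useNA = "ifany")`.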


error1<- sum(result1[1,2], result1[2,1])/sum(result1)
error1



library(ROCR)

pred = ROCR::prediction(pred1,ytest)
perf <- performance(pred, "tpr", "fpr")

Error: 'predictions' contains NA.

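I have already tried adding na.action = na.exclude both to my glm model and to predict.glm (as suggested here: How to Use `predict()` without errors in a model when you have missing data?). If I put it in predict.glm, I get a different error instead: all arguments must have the same length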

I hope you can point me in the right direction. Thanks!
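
Both errors would be consistent with a few test rows having missing predictor values (the model summary reports 4 observations deleted due to missingness). With `newdata`, `predict()` defaults to `na.action = na.pass` and returns `NA` for such rows, while an `na.action` that removes rows returns a shorter vector that no longer matches the labels. A minimal sketch on simulated data, assuming that is the cause; `sim`, `fit`, `test` and the other names are hypothetical and not the `ME` data:

```r
set.seed(1)

# Simulated data; all names here are hypothetical, not the asker's ME data.
n_sim <- 200
sim <- data.frame(x = rnorm(n_sim))
sim$y <- rbinom(n_sim, 1, plogis(sim$x))

fit <- glm(y ~ x, data = sim, family = "binomial")

# Test set with one missing predictor value.
test <- sim[1:50, ]
test$x[3] <- NA

# Default for newdata is na.action = na.pass: one prediction per row, NA kept.
p_pass <- predict(fit, newdata = test, type = "response")
length(p_pass)       # 50
sum(is.na(p_pass))   # 1

# na.action = na.omit drops the incomplete row, so the vector no longer
# lines up with the 50 labels.
p_omit <- predict(fit, newdata = test, type = "response", na.action = na.omit)
length(p_omit)       # 49

# Keep only rows that have a prediction, and subset the labels the same way.
ok <- !is.na(p_pass)
p_ok <- p_pass[ok]
y_ok <- test$y[ok]

# ROCR is optional here; the guard keeps the sketch runnable without it.
if (requireNamespace("ROCR", quietly = TRUE)) {
  pred_obj <- ROCR::prediction(p_ok, y_ok)
  perf <- ROCR::performance(pred_obj, "tpr", "fpr")
}
```

Applied to the code in the question, the same idea would be `ok <- !is.na(pred1)` followed by `ROCR::prediction(pred1[ok], ytest[ok])`. Filtering predictions and labels with one logical index keeps them aligned, which dropping rows inside `predict()` does not.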

0 answers:

No answers yet