I am running into a problem with the following code:
model4 = glm(data = data16, Loan_Status_Coded ~ Coapplicant_Income_Modified +
Dependents_SelfEmployed_1 + Dependents_Imputed_0_Dummy +
Dependents_Imputed_1_Dummy + Dependents_Imputed_2_Dummy+
Self_Employed_Imputed_Coded + Credit_History_Married + Married_Imputed_Coded +
sqrt_LoanAmount_Imputed + Loan_Amount_Term_Imputed_Low_Dummy +
Loan_Amount_Term_Imputed_Medium_Dummy + Credit_History_Imputed +
Education_Coded + Property_Area_Semiurban_Dummy + Property_Area_Rural_Dummy,
family = binomial(link = "logit"))
summary(model4)
predict5 = predict(data = data16, model4, type = "response")
table(data16$Loan_Status_Coded, predict5>0.5)
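Running the table function produces the following error:
"all arguments must have the same length"
It seems that predict5 has fewer rows than data16.
If I use predict5 = predict(newdata = data16, model4, type = "response") instead, the error goes away, but the number of data points in the table is reduced. For example, the output when using newdata is: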
    FALSE TRUE
  0    40   39
  1     7  176
But data16 has 614 rows.
What am I doing wrong here?
Answer 0 (score: 1)
The culprit here was NA values in one of the variables in data16. After handling the NA values, it works correctly.
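To make the mechanism concrete, here is a minimal sketch on made-up data (the `toy` data frame and its columns `x`, `z`, `y` are hypothetical stand-ins, not the asker's data16). By default `glm` drops rows containing NA before fitting, so `predict` without `newdata` returns one value per row actually used. `predict` also has no `data` argument, so `data = data16` is most likely just absorbed by `...` and ignored. With `newdata`, rows containing NA get an NA prediction, and `table` leaves NA out of its counts by default, which would explain the smaller total.

```r
# Toy data standing in for data16 (hypothetical, for illustration only)
set.seed(1)
n <- 100
toy <- data.frame(x = rnorm(n), z = rnorm(n))
toy$y <- rbinom(n, 1, plogis(0.5 + toy$x))
toy$x[c(3, 10, 27)] <- NA          # introduce missing predictor values

# Step 0: find out which columns contain NA
colSums(is.na(toy))

# Default na.action (na.omit): the 3 incomplete rows are dropped before fitting
fit <- glm(y ~ x + z, data = toy, family = binomial(link = "logit"))
p_fitted <- predict(fit, type = "response")
length(p_fitted)                   # 97, shorter than nrow(toy)

# Option 1: na.exclude pads predictions with NA so lengths match the data
fit_ex <- glm(y ~ x + z, data = toy, family = binomial(link = "logit"),
              na.action = na.exclude)
p_padded <- predict(fit_ex, type = "response")
length(p_padded)                   # 100

# useNA = "ifany" shows the rows without a prediction instead of hiding them
tab_all <- table(toy$y, p_padded > 0.5, useNA = "ifany")

# Option 2: compare only the complete rows that were used in the fit
cc <- complete.cases(toy[, c("y", "x", "z")])
tab_cc <- table(toy$y[cc], p_fitted > 0.5)
```

Which option fits depends on what the NA values mean. If every row needs a prediction, the missing values have to be imputed or otherwise handled before fitting, as was done here.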