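I have a clinical dataset with subject IDs as rows and the different variables as columns. I want to build a predictive model, and I have split my data into training and test sets. I fitted a logistic regression model, but for some reason the summary of the fit shows the subject IDs as coefficients instead of the columns/variables.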
This is what the dataset looks like:
subjectkey sex height weight interview_age flanker_score cardsort_score intbehaviour_score
NDAR_INV09AUXBBT M 59.00000 104.00000 118 107 109 GOOD
NDAR_INV0BVP2PTD F 50.25000 60.00000 120 92 103 GOOD
NDAR_INV0CV2Y4YR M 55.30000 97.00000 120 83 94 BAD
NDAR_INV0X45NBYM M 63.50000 104.50000 128 101 103 BAD
This is the code I used to fit the model:
data.train.glm <- glm(intbehaviour_score~., data = data.train, family = binomial)
#summary of fit
summary(data.train.glm)
This is the output I get:
Call:
glm(formula = intbehaviour_score ~ ., family = binomial, data = data.train)
Deviance Residuals:
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[34] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[67] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[100] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[133] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[166] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[199] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[232] 0 0 0 0
Coefficients: (11 not defined because of singularities)
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.657e+01 3.561e+05 0 1
subjectkeyNDAR_INV0BVP2PTD -5.916e-13 5.036e+05 0 1
subjectkeyNDAR_INV0CV2Y4YR 5.313e+01 5.036e+05 0 1
subjectkeyNDAR_INV0X45NBYM 5.313e+01 5.036e+05 0 1
subjectkeyNDAR_INV10EP1VM2 -6.084e-13 5.036e+05 0 1
I don't understand why the subject IDs show up as coefficients instead of the variables.
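One thing that may be relevant: as far as I understand, the `.` in the formula expands to every column other than the response, which would include `subjectkey`. Below is a minimal sketch that seems to reproduce the behaviour. It uses made-up toy data (the `toy` data frame, its `ID_` keys, and all values are hypothetical, not my real dataset) and assumes `subjectkey` is stored as an ordinary character column rather than as row names.

```r
# Toy data with made-up values (hypothetical; not the real dataset)
set.seed(1)
n <- 40
toy <- data.frame(
  subjectkey = paste0("ID_", seq_len(n)),
  sex = factor(sample(c("M", "F"), n, replace = TRUE)),
  height = rnorm(n, mean = 55, sd = 4),
  flanker_score = rnorm(n, mean = 95, sd = 10),
  intbehaviour_score = factor(sample(c("GOOD", "BAD"), n, replace = TRUE)),
  stringsAsFactors = FALSE
)

# `~ .` uses every other column as a predictor, including subjectkey,
# so each ID is turned into its own dummy variable.
# Warnings about fitted probabilities of 0 or 1 are suppressed here.
fit_all <- suppressWarnings(
  glm(intbehaviour_score ~ ., data = toy, family = binomial)
)

# Removing the ID column from the formula leaves only the actual variables
fit_vars <- glm(intbehaviour_score ~ . - subjectkey, data = toy, family = binomial)

# Compare the coefficient names of the two fits
print(head(names(coef(fit_all))))
print(names(coef(fit_vars)))
```

If that is indeed the cause, I assume dropping the column before fitting, or writing `. - subjectkey` in the formula, would be the way to keep the IDs out of the model, but I would appreciate confirmation.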