我正在尝试使用以下代码进行逻辑回归:
model <- glm (Participation ~ Gender + Race + Ethnicity + Education + Comorbidities + WLProgram + LoseWeight + EverLoseWeight + PastYearLW + Age + BMI, data = LogisticData, family = binomial)
摘要(模型)
我不断收到错误消息:
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : contrasts can be applied only to factors with 2 or more levels
在检查论坛时,我检查了哪些变量是因素:
str(LogisticData)
'data.frame': 994 obs. of 13 variables:
$ outcome : Factor w/ 2 levels "No","Yes": 1 1 2 2 1 2 2 1 2 2 ...
$ Gender : Factor w/ 3 levels "Male","Female",..: 1 2 2 1 2 1 1 1 1
$ Race : Factor w/ 3 levels "White","Black",..: 1 1 1 3 1 1 1 1 1 1
$ Ethnicity : Factor w/ 2 levels "Hispanic/Latino",..: 2 2 2 2 2 2 2 2 2
$ Education : Factor w/ 2 levels "Below Bachelors",..: 1 1 1 2 1 1 1 2 1
$ Comorbidities : Factor w/ 2 levels "No","Yes": 1 1 2 1 1 1 2 2 1 1 ...
$ WLProgram : Factor w/ 2 levels "No","Yes": NA 1 2 2 1 1 1 NA 1 1 ...
$ LoseWeight : Factor w/ 2 levels "Yes","No": 2 1 1 1 1 1 1 2 1 1 ...
$ PastYearLW : Factor w/ 2 levels "Yes","No": NA 2 1 1 1 2 1 NA 1 1 ...
$ EverLoseWeight: Factor w/ 2 levels "Yes","No": 2 1 1 1 1 1 1 2 1 1 ...
$ Age : int 29 35 69 32 21 45 40 62 59 58 ...
$ Participation : Factor w/ 2 levels "Yes","No": 2 2 1 1 1 1 1 2 1 2 ...
$ BMI : num 25.7 33.8 26.4 32.3 27.5 ...
所有因素似乎都有2个或更多的水平。
我还尝试省略了NA,但仍然给我这个错误。
我想要回归中的所有变量,但无法弄清楚为什么它不运行。
执行时:
newdata <- droplevels(na.omit(LogisticData))
> str(newdata)
'data.frame': 840 obs. of 13 variables:
$ outcome : Factor w/ 2 levels "No","Yes": 1 2 2 1 2 2 2 2 2 2 ...
$ Gender : Factor w/ 3 levels "Male","Female",..: 2 2 1 2 1 1 1 2 1
$ Race : Factor w/ 3 levels "White","Black",..: 1 1 3 1 1 1 1 1 3
$ Ethnicity : Factor w/ 2 levels "Hispanic/Latino",..: 2 2 2 2 2 2 2 2
$ Education : Factor w/ 2 levels "Below Bachelors",..: 1 1 2 1 1 1 1 1
$ Comorbidities : Factor w/ 2 levels "No","Yes": 1 2 1 1 1 2 1 1 1 2 ...
$ WLProgram : Factor w/ 2 levels "No","Yes": 1 2 2 1 1 1 1 1 1 1 ...
$ LoseWeight : Factor w/ 1 level "Yes": 1 1 1 1 1 1 1 1 1 1 ...
$ PastYearLW : Factor w/ 2 levels "Yes","No": 2 1 1 1 2 1 1 1 1 2 ...
$ EverLoseWeight: Factor w/ 1 level "Yes": 1 1 1 1 1 1 1 1 1 1 ...
$ Age : int 35 69 32 21 45 40 59 58 23 32 ...
$ Participation : Factor w/ 2 levels "Yes","No": 2 1 1 1 1 1 1 2 2 1 ...
$ BMI : num 33.8 26.4 32.3 27.5 45.4 ...
- attr(*, "na.action")=Class 'omit' Named int [1:154] 1 8 13 14 21 24 25
46 55 58 ...
.. ..- attr(*, "names")= chr [1:154] "1" "8" "13" "14" ...
这对我来说没有意义,因为您可以在第一个str(逻辑数据)中看到EverLoseWeight中显然有2个级别,因为您可以看到Yes和No以及1和2?我该如何解决这个异常?
答案 0 :(得分:0)
尝试对原始数据进行summary
,并确保所有级别都具有值。我会对此发表评论,但我没有信誉点:(
答案 1 :(得分:0)
鉴于您的更新,看来您至少有两种可能性。
1:在删除NA后,删除仅剩一个级别的因子(即LoseWeight
和EverLoseWeight
)。
2:将NA视为额外级别。类似于
a = as.factor(c(1,1,NA,2))
b = as.factor(c(1,1,2,1))
# 0 is an unused factor level for a
x = data.frame(a, b)
levels(x$a) = c(levels(x$a), 0)
x$a[is.na(x$a)] = 0
但这可能无法处理任何导致单级因素的奇异性问题。