Let me start with some example data.
set.seed(1)
x1=rnorm(10)
y=as.factor(sample(c(1,0),10,replace=TRUE))
x2=sample(c('Young','Middle','Old'),10,replace=TRUE)
model1 <- glm(y~as.factor(x1>=0)+as.factor(x2),binomial)
When I run summary(model1), I get:
                       Estimate Std. Error z value Pr(>|z|)
(Intercept)             -0.1835     1.0926  -0.168    0.867
as.factor(x1 >= 0)TRUE   0.7470     1.7287   0.432    0.666
as.factor(x2)Old         0.7470     1.7287   0.432    0.666
as.factor(x2)Young      18.0026  4612.2023   0.004    0.997
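Please ignore the model estimates, since the data are fake.
Is there a way in R to change the coefficient names shown in the leftmost column so that they look cleaner? For example, remove as.factor and put _ before the factor level. The output should look like this: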
               Estimate Std. Error z value Pr(>|z|)
(Intercept)     -0.1835     1.0926  -0.168    0.867
(x1 >= 0)_TRUE   0.7470     1.7287   0.432    0.666
(x2)_Old         0.7470     1.7287   0.432    0.666
(x2)_Young      18.0026  4612.2023   0.004    0.997
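Answer 0 (score: 5)
In addition to the comments above, the other part is to put all of your data in a data frame and name the variables accordingly. The coefficient names are then no longer derived from an ugly expression crammed into your formula: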
library(car)
dat <- data.frame(y = y,
                  x1 = cut(x1, breaks = c(-Inf, 0, Inf), labels = c("x1 < 0", "x1 >= 0"), right = FALSE),
                  x2 = as.factor(x2))
# To illustrate Brian's suggestion above
options(decorate.contr.Treatment = "")
model1 <- glm(y ~ x1 + x2, binomial, data = dat,
              contrasts = list(x1 = "contr.Treatment", x2 = "contr.Treatment"))
summary(model1)
Call:
glm(formula = y ~ x1 + x2, family = binomial, data = dat, contrasts = list(x1 = "contr.Treatment",
x2 = "contr.Treatment"))
Deviance Residuals:
Min 1Q Median 3Q Max
-1.7602 -0.8254 0.3456 0.8848 1.2563
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -0.1835     1.0926  -0.168    0.867
x1[x1 >= 0]   0.7470     1.7287   0.432    0.666
x2[Old]       0.7470     1.7287   0.432    0.666
x2[Young]    18.0026  4612.2023   0.004    0.997
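If loading car is not an option, a similar effect can probably be had with base R alone, because the default treatment contrasts name each coefficient by pasting the variable name and the level label together. The following is only a sketch under that assumption: it prefixes every level label with "_" inside the data frame, so the names come out as x1_TRUE, x2_Old and so on. The object names x2f and model2 are made up for illustration.

```r
# Sketch: control coefficient names through variable and level names (base R only).
set.seed(1)
x1 <- rnorm(10)
y  <- as.factor(sample(c(1, 0), 10, replace = TRUE))
x2 <- sample(c("Young", "Middle", "Old"), 10, replace = TRUE)

# Prefix each level label of x2 with "_" (x2f is a hypothetical name).
x2f <- factor(x2)
levels(x2f) <- paste0("_", levels(x2f))

dat <- data.frame(
  y  = y,
  # Dichotomise x1 and give the two levels "_"-prefixed labels.
  x1 = factor(x1 >= 0, levels = c(FALSE, TRUE), labels = c("_FALSE", "_TRUE")),
  x2 = x2f
)

# Coefficient names are built as <variable name><level label>.
model2 <- glm(y ~ x1 + x2, data = dat, family = binomial)
names(coef(model2))
```

The trade-off is that the underscore now lives in the data itself, so tables or plots made from dat will show the prefixed labels as well.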
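Answer 1 (score: 1)
For the first part, get the data right before fitting the model. Collect your variables in a data frame and include the processed variables in that data frame, so that you control their names. For example: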
set.seed(1)
x1 <- rnorm(10)
y <- as.factor(sample(c(1,0), 10, replace=TRUE))
x2 <- sample(c('Young', 'Middle', 'Old'), 10, replace=TRUE)
dat <- data.frame(y = y, x1 = x1, x2 = factor(x2),
                  x1.gt.0 = factor(x1 >= 0))
model1 <- glm(y ~ x1.gt.0 + x2, data = dat, family = binomial)
> coef(model1)
(Intercept) x1.gt.0TRUE      x2Old    x2Young
 -0.1835144   0.7469661  0.7469661 18.0026168
This is how you should use the formula interface in most R functions.
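If the model has already been fitted with as.factor() in the formula, the row names of the coefficient table can also be rewritten afterwards. This is a rough sketch, not a general solution: clean_names() is a hypothetical helper, and its regular expression assumes every term to be cleaned has the form as.factor(...)level, as in the question.

```r
# Sketch: rename the rows of the coefficient table after fitting.
set.seed(1)
x1 <- rnorm(10)
y  <- as.factor(sample(c(1, 0), 10, replace = TRUE))
x2 <- sample(c("Young", "Middle", "Old"), 10, replace = TRUE)
model1 <- glm(y ~ as.factor(x1 >= 0) + as.factor(x2), family = binomial)

# clean_names() is a hypothetical helper, not part of base R.
# It turns "as.factor(x2)Old" into "(x2)_Old"; names that do not
# start with "as.factor(" (such as "(Intercept)") are left unchanged.
clean_names <- function(nm) {
  sub("^as\\.factor(\\(.*\\))(.+)$", "\\1_\\2", nm)
}

# coef(summary(...)) returns the coefficient matrix that summary() prints.
coefs <- coef(summary(model1))
rownames(coefs) <- clean_names(rownames(coefs))
printCoefmat(coefs)
```

This only changes the printed table. The fitted model object keeps its original coefficient names, so functions such as predict() are unaffected.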