Changing the Y intercept in a Poisson GLM in R

Time: 2018-03-30 20:19:43

Tags: r output regression intercept coefficients

Background: I have the following data, on which I run the glm function:

location = c("DH", "Bos", "Beth")
count = c(166, 57, 38)

#make into df
df = data.frame(location, count) 

#poisson
summary(glm(count ~ location, family=poisson))

Output:

Coefficients:

            Estimate Std. Error z value Pr(>|z|)    
(Intercept)   3.6376     0.1622  22.424  < 2e-16 ***
locationBos   0.4055     0.2094   1.936   0.0529 .  
locationDH    1.4744     0.1798   8.199 2.43e-16 ***

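Question: I would like to change the (Intercept) so that all values are reported relative to Bos.

I looked at "Change reference group using glm with binomial family" and "How to force R to use a specified factor level as reference in a regression?". I tried that approach, but it did not work, and I do not know why.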

Attempt:

df1 <- within(df, location <- relevel(location, ref = 1))

#poisson
summary(glm(count ~ location, family=poisson, data = df1))

Desired output:

Coefficients:

            Estimate Std. Error z value Pr(>|z|)    
(Intercept)    ...
locationBeth   ...
locationDH     ...

Question: How can I fix this?

1 Answer:

Answer 0 (score: 3)

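I think your problem is that you are modifying the data frame, but your model is not using the data frame. Pass the data frame through the data argument of glm() so that the model uses its columns.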
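A second detail in the attempt is worth checking too. A numeric ref in relevel() is a position in levels(), and factor levels are sorted alphabetically by default, so ref = 1 selects the level that is already the reference. The sketch below is only an illustration of that point; it converts the column with factor() explicitly, which is an assumption added here so that it runs regardless of R version.

```r
location <- c("DH", "Bos", "Beth")
count <- c(166, 57, 38)

# Explicit factor() so the column is a factor on any R version.
df <- data.frame(location = factor(location), count)

# Levels are sorted alphabetically, so "Beth" is the default reference.
levels(df$location)
# "Beth" "Bos"  "DH"

# ref = 1 picks the first level, which is already the reference: no change.
same <- relevel(df$location, ref = 1)
identical(levels(same), levels(df$location))
# TRUE

# Naming the level (or using its position, 2) moves "Bos" to the front.
bos_first <- relevel(df$location, ref = "Bos")
levels(bos_first)
# "Bos"  "Beth" "DH"
```

Naming the level as a string is usually safer than using a position, because it does not depend on how the levels happen to be ordered.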

location = c("DH", "Bos", "Beth")
count = c(166, 57, 38)
# make into df
df = data.frame(location, count) 

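Note that location by itself is a character vector. In R versions before 4.0.0, data.frame() coerces it to a factor in the data frame by default; from 4.0.0 onward the default is stringsAsFactors = FALSE, so the column stays character unless it is converted. Wrapping the column in factor() works in either case, and once it is a factor we can use relevel() to specify the reference level.

df$location = relevel(factor(df$location), ref = "Bos") # set Bos as reference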
summary(glm(count ~ location, family=poisson, data = df))

    # Call:
    # glm(formula = count ~ location, family = poisson, data = df)
    # ...
    # Coefficients:
    #              Estimate Std. Error z value Pr(>|z|)    
    # (Intercept)    4.0431     0.1325  30.524  < 2e-16 ***
    # locationBeth  -0.4055     0.2094  -1.936   0.0529 .  
    # locationDH     1.0689     0.1535   6.963 3.33e-12 ***
    # ...
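
Because there is exactly one count per location, this model is saturated, so the coefficients can be checked by hand: the intercept is the log of the reference count, and each remaining coefficient is the log of a count ratio against Bos. A minimal sketch of that check follows; the names fit and by_hand are illustrative, and the explicit factor() call is an assumption added so the code also runs on R 4.0.0 and later.

```r
location <- c("DH", "Bos", "Beth")
count <- c(166, 57, 38)

df <- data.frame(location = factor(location), count)
df$location <- relevel(df$location, ref = "Bos")  # Bos becomes the reference

fit <- glm(count ~ location, family = poisson, data = df)

# Hand-computed values on the log scale, in the order
# (Intercept), locationBeth, locationDH.
by_hand <- c(log(57), log(38 / 57), log(166 / 57))

# Side-by-side comparison of fitted and hand-computed coefficients.
cbind(glm = unname(coef(fit)), by_hand)

# exp() turns the non-intercept coefficients into rate ratios versus Bos.
exp(coef(fit))
```

On the exponentiated scale the intercept is the Bos count itself (57), and locationDH is 166/57, roughly 2.91 times the Bos count.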