Question

I'm using R to run some statistical analyses, and I'm having trouble interpreting a model. Consider the following data:

df <- data.frame( richness=c(9,13,10,12,11,5,6,8,9,10,10,8, 
                 5,7,6,9,5,6,7,8,4,10,5,8, 
                 4,5,7,5,6,7,4,5,5,6,6,6, 
                 1,0,2,1,4,5,3,2,0,1,4,4),
        condition=c(rep("A",24), rep("B",24)), 
        area = c(12.62, 11.07, 15.72, 15.41, 6.42, 19.13, 17.58, 19.44, 13.55, 18.20, 6.73, 14.79, 5.80, 14.48, 17.89,  7.66, 10.76,  8.90, 8.59, 12.00, 12.93,  7.04, 17.27, 16.34,  9.83,  9.52, 19.75, 10.14, 13.86, 12.31, 16.03, 11.38, 14.17, 15.10, 18.51,  9.21, 20.06, 20.37,  7.97,  7.35,  8.28, 16.65,  6.11, 18.82, 10.45, 16.96, 11.69, 13.24),
        treatment=rep(c(rep("absent",12), rep("present",12)), 2)) 

library(MASS)
nb.fit <- glm.nb( richness ~ condition * treatment, data=df)
exp(coef(nb.fit))

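By taking exp(coef(nb.fit)) we can calculate the mean of each combination. For simplicity, let's use only the intercept: the mean richness for A:absent is 9.25 species. This can be obtained (verified) with: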

mean(subset(df, condition=="A" & treatment=="absent")$richness)
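The same check can be extended to all four cells. This is a minimal sketch, assuming the data and model defined above (the data are repeated so the block runs on its own; `cell_means` and `model_means` are names introduced here for illustration):

```r
library(MASS)

# Same data as above (area is not needed for this check)
df <- data.frame(
  richness = c(9,13,10,12,11,5,6,8,9,10,10,8,
               5,7,6,9,5,6,7,8,4,10,5,8,
               4,5,7,5,6,7,4,5,5,6,6,6,
               1,0,2,1,4,5,3,2,0,1,4,4),
  condition = c(rep("A", 24), rep("B", 24)),
  treatment = rep(c(rep("absent", 12), rep("present", 12)), 2))

# glm.nb may emit convergence warnings if the counts show little overdispersion
nb.fit <- glm.nb(richness ~ condition * treatment, data = df)

# Observed mean richness in each condition x treatment cell
cell_means <- aggregate(richness ~ condition + treatment, data = df, FUN = mean)

# Model-based means: predictions on the response scale for each cell
model_means <- predict(nb.fit,
                       newdata = cell_means[, c("condition", "treatment")],
                       type = "response")

# Side-by-side comparison of observed and fitted cell means
cbind(cell_means, model = unname(model_means))
```

Without an offset, the fitted mean of every cell should match the plain average of `richness` in that cell, not only the intercept.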

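Now, the plots were sampled over different areas, which means sampling effort differs among them. We should (probably must) account for this, so let's include area as an offset in the model.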

nb.fit2 <- glm.nb(richness ~ condition * treatment + offset(log(area)),data=df)

Now look at the model coefficients:

exp(coef(nb.fit2))
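As far as I understand it, an offset enters the linear predictor with its coefficient fixed at 1, so with the log link the model can be written as below. Here $\mu_i$ is the expected richness of plot $i$, $\text{area}_i$ is its area, and $\mathbf{x}_i^\top \boldsymbol{\beta}$ stands for the `condition * treatment` terms; the notation is my own, not taken from the MASS documentation.

```latex
\log \mu_i = \mathbf{x}_i^\top \boldsymbol{\beta} + \log(\text{area}_i)
\quad\Longleftrightarrow\quad
\frac{\mu_i}{\text{area}_i} = \exp\!\left(\mathbf{x}_i^\top \boldsymbol{\beta}\right)
```

Under this reading, the exponentiated intercept is the expected richness per unit area for the reference cell (A:absent), and the other exponentiated coefficients act multiplicatively on that rate.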

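Question 1: Is it correct to interpret this as a mean richness per unit area of 0.69 species for A:absent (in this case, 0.69 species per square meter)?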

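Question 2: How does the offset enter the calculation? For example, I cannot obtain the mean by hand as in the first case, mean(subset(df, condition=="A" & treatment=="absent")$richness), even when I try to remove the effect of area with: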

df$richness.area <- df$richness/df$area
mean(subset(df, condition=="A" & treatment=="absent")$richness.area)

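The mean for A:absent is not the same. I also tried removing the effect of area by using its residuals, and the values again do not match.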
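One way to see where the difference comes from is to compare three quantities for A:absent: the mean of the per-plot ratios, the ratio of the totals, and the model intercepts. The sketch below relies on two things: in a Poisson log-link model with an offset, the fitted rate for a cell is sum(richness)/sum(area); and the negative binomial score equation with theta held fixed gives a weighted version of that ratio. The names `mean_of_ratios`, `ratio_of_sums`, `pois.fit`, `nb.fixed`, and `w` are introduced here for illustration, and refitting with theta fixed at the `glm.nb` estimate is my own simplification.

```r
library(MASS)

# Same data as above
df <- data.frame(
  richness = c(9,13,10,12,11,5,6,8,9,10,10,8,
               5,7,6,9,5,6,7,8,4,10,5,8,
               4,5,7,5,6,7,4,5,5,6,6,6,
               1,0,2,1,4,5,3,2,0,1,4,4),
  condition = c(rep("A", 24), rep("B", 24)),
  area = c(12.62, 11.07, 15.72, 15.41, 6.42, 19.13, 17.58, 19.44,
           13.55, 18.20, 6.73, 14.79, 5.80, 14.48, 17.89, 7.66,
           10.76, 8.90, 8.59, 12.00, 12.93, 7.04, 17.27, 16.34,
           9.83, 9.52, 19.75, 10.14, 13.86, 12.31, 16.03, 11.38,
           14.17, 15.10, 18.51, 9.21, 20.06, 20.37, 7.97, 7.35,
           8.28, 16.65, 6.11, 18.82, 10.45, 16.96, 11.69, 13.24),
  treatment = rep(c(rep("absent", 12), rep("present", 12)), 2))

cellA <- subset(df, condition == "A" & treatment == "absent")

# (1) Mean of per-plot ratios: every plot counts equally, whatever its area
mean_of_ratios <- mean(cellA$richness / cellA$area)

# (2) Ratio of totals: total species count over total area sampled
ratio_of_sums <- sum(cellA$richness) / sum(cellA$area)

# Poisson model with the same offset: its intercept reproduces (2)
pois.fit <- glm(richness ~ condition * treatment + offset(log(area)),
                family = poisson, data = df)
pois_rate <- unname(exp(coef(pois.fit))[1])

# Negative binomial fit, then a refit with theta held at its estimate
nb.fit2 <- glm.nb(richness ~ condition * treatment + offset(log(area)), data = df)
theta <- nb.fit2$theta
nb.fixed <- glm(richness ~ condition * treatment + offset(log(area)),
                family = negative.binomial(theta), data = df)
nb_rate <- unname(exp(coef(nb.fixed))[1])

# NB score equation for the cell: sum(w * (y - mu)) = 0, where
# w = 1 / (1 + mu / theta) and mu = area * rate, so the rate is a weighted ratio
mu <- cellA$area * nb_rate
w <- 1 / (1 + mu / theta)
weighted_ratio <- sum(w * cellA$richness) / sum(w * cellA$area)

c(mean_of_ratios = mean_of_ratios, ratio_of_sums = ratio_of_sums,
  poisson = pois_rate, negbin = nb_rate, weighted = weighted_ratio)
```

If this is right, the intercept is neither the mean of richness/area nor, in the negative binomial case, exactly the ratio of totals, which may be why my manual averages do not match.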

Does anyone know the math involved?

Thanks

Understanding the offset

0 answers: