How do I interpret the coefficients of nested ERGM models?

Asked: 2017-04-30 12:48:49

Tags: social-networking

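I am trying to explain a homophily effect that I observe in a friendship network by means of another homophily effect, and I would like to know whether nested models can do this.

Here's the thing. I am modelling a friendship network of middle-school students. Through the 'nodematch' ergm term, it is clear that students who share the same social background (their parents' socioeconomic status, actually) are more likely to form ties. This can be partly explained by the fact that, in their town, students from the same social background usually live closer to each other. So I added a second 'nodematch' term to my model, which counts the number of edges in which both students come from the same neighborhood. It is indeed significant (the 'anova.ergm' function also confirms that the second model fits better than the first); in the second model, the coefficient of the social homophily parameter is still significant, but smaller than in model 1. Can I interpret this as spatial proximity "explaining" part of the social homophily effect (as in nested linear regressions)? Or are the coefficients of the two models not comparable?

Here is a short example using the sampson data from statnet, which looks a lot like my own situation: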

# Load statnet and the data :
library(statnet)
library(stargazer)
data('sampson')

# Estimate 2 models : one only with homophily on 'cloisterville', and one with both 'cloisterville' and 'group' homophily.
m1 <- ergm(samplike ~ edges + nodematch('cloisterville'))
m2 <- ergm(samplike ~ edges + nodematch('cloisterville') + nodematch('group'))

# The second model is a better fit than the first one :
anova(m1, m2)  # dispatches to anova.ergm for ergm objects

# Look at the models :
stargazer(m1,m2,type="text")
# The log-odds for nodematch.cloisterville fell considerably, from 1.585 to 0.586 !
# That's because most edges that match on cloisterville also match on group.
# However, is it okay to say that group homophily explains about two thirds of cloisterville homophily ? [(1.585 - 0.586)/1.585 = 0.63]
# Is there any way to assess the significance of this drop in the cloisterville coefficient ?

Thank you very much for your help!

Timothée

1 answer:

Answer 0 (score: 0)

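Yes, the interpretation is the same. In the case you provided in particular, the model contains no dyad-dependent terms, so ergm never starts MCMC and the fit is just a logistic regression. If you had dyad-dependent terms, ergm would still begin from a glm-style (pseudolikelihood) fit; the ergm "magic" lies in the MCMC step that refines the estimates and their standard errors, so the match with glm would no longer be exact.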
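To make the equivalence concrete, here is a sketch of why a model with only dyad-independent terms reduces to logistic regression. The notation (y_ij for the tie from node i to node j, c_i and g_i for the 'cloisterville' and 'group' attributes of node i, and the theta coefficients) is introduced here for illustration and does not come from the question. With only edges and nodematch terms, each ordered pair of nodes contributes independently, so model 2 can be written as:

```latex
\operatorname{logit} P(y_{ij} = 1)
  = \theta_{\text{edges}}
  + \theta_{\text{cloisterville}} \, \mathbf{1}[c_i = c_j]
  + \theta_{\text{group}} \, \mathbf{1}[g_i = g_j],
  \qquad i \neq j
```

Model 1 is the same expression without the last term, which is why the two fits are nested in the usual regression sense.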

The models can be estimated with glm:

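library(statnet)
library(stargazer)
data('sampson')

# Refit the two ERGMs from the question so that this block runs on its own.
m1 <- ergm(samplike ~ edges + nodematch('cloisterville'))
m2 <- ergm(samplike ~ edges + nodematch('cloisterville') + nodematch('group'))

# Response: one 0/1 entry per ordered dyad; the diagonal (self-ties) is set to NA.
y <- gvectorize(as.matrix(samplike), censor.as.na = TRUE)

# x1: 1 if both nodes have the same 'cloisterville' value, 0 otherwise.
x1 <- 1 * outer(samplike %v% "cloisterville", samplike %v% "cloisterville", FUN = "==")
x1 <- gvectorize(x1)

# x2: 1 if both nodes belong to the same 'group', 0 otherwise.
x2 <- 1 * outer(samplike %v% "group", samplike %v% "group", FUN = "==")
x2 <- gvectorize(x2)

# Logistic regressions that mirror m1 and m2; rename coefficients to match the ergm terms.
glm1 <- glm(y ~ 1 + x1, family = "binomial")
names(glm1$coefficients) <- c("edges", "nodematch.cloisterville")
glm2 <- glm(y ~ 1 + x1 + x2, family = "binomial")
names(glm2$coefficients) <- c("edges", "nodematch.cloisterville", "nodematch.group")

# Side-by-side comparison of the ergm and glm fits.
stargazer(m1, glm1, m2, glm2, type = "text")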

=================================================================================
                                           Dependent variable:                   
                        ---------------------------------------------------------
                             samplike          y          samplike          y    
                        exponential family logistic  exponential family logistic 
                           random graph                 random graph             
                               (1)            (2)           (3)            (4)   
---------------------------------------------------------------------------------
edges                       -0.662***      -0.662***     -1.768***      -1.768***
                             (0.176)        (0.176)       (0.256)        (0.256) 

nodematch.cloisterville      -0.487*        -0.487*        -0.460        -0.460  
                             (0.254)        (0.254)       (0.304)        (0.304) 

nodematch.group                                           2.643***      2.643*** 
                                                          (0.304)        (0.304) 

---------------------------------------------------------------------------------
Observations                                  306                          306   
Log Likelihood                             -181.749                     -137.282 
Akaike Inf. Crit.            367.497        367.497       280.565        280.565 
Bayesian Inf. Crit.          374.944                      291.736                
=================================================================================
Note:                                                 *p<0.1; **p<0.05; ***p<0.01

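In your original example, you can interpret the neighborhood homophily effect as net of the SES homophily effect. The results seem to say: SES matters, and neighborhood matters more. I imagine the two variables are very highly correlated.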
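One way to check that last point on the sampson example is to cross-tabulate the two match indicators over all ordered dyads. This is a minimal sketch that rebuilds the indicators the same way as x1 and x2 above; the names same_clo, same_grp and overlap are made up for illustration.

```r
library(statnet)   # loads network, sna and ergm
data('sampson')

# Node attributes used by the two nodematch terms.
clo <- samplike %v% "cloisterville"
grp <- samplike %v% "group"

# Dyad-level match indicators (1 = same value, 0 = different);
# gvectorize() sets the diagonal (self-pairs) to NA by default.
same_clo <- gvectorize(1 * outer(clo, clo, FUN = "=="))
same_grp <- gvectorize(1 * outer(grp, grp, FUN = "=="))

# Cross-tabulation over ordered dyads; table() drops the NA entries.
overlap <- table(cloisterville = same_clo, group = same_grp)
print(overlap)

# Correlation between the two indicators, ignoring the NA diagonal.
print(cor(same_clo, same_grp, use = "complete.obs"))
```

If most dyads fall on the diagonal of this table, the two indicators carry largely the same information, and a drop in one coefficient when the other term is added is to be expected.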