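I am learning Bayesian regression modelling using the brms package in R.
I am modelling per-capita infection rates across neighbourhoods and investigating associations with other neighbourhood-level covariates, such as poverty level and distance to a health centre.
For example: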
library(tidyverse)
library(brms)
set.seed(1234)
df <- tibble(neighbourhood = 1:200,
             cases = rpois(n = 200, lambda = 3),
             population = round(runif(n = 200, min = 100, max = 10000)),
             poverty = round(runif(n = 200, min = 30, max = 90)),
             distance = runif(n = 200, min = 20, max = 10000))
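Using this made-up dataset, I can build a Bayesian regression model (my real model is more complex, with a spatial autocorrelation term and other covariates).
Note the offset(log(population)) term, which adjusts for neighbourhood population size.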
prior <- c(prior_string("normal(0,10)", class = "b"),
           prior_(~normal(0,10), class = ~Intercept))
m1 <- brm(bf(cases ~
               poverty +
               log(distance) +
               offset(log(population))),
          data = df,
          family = 'poisson',
          prior = prior,
          iter = 4000, warmup = 1000,
          chains = 3,
          seed = 1234,
          control = list(adapt_delta = 0.95))
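For reference, and assuming I have understood the offset correctly, the model above can be written as follows. The symbols \mu_i and \beta_0, \beta_1, \beta_2 are my own notation for the expected count in neighbourhood i and the regression coefficients; they do not appear in the brms output.

```latex
\begin{aligned}
\text{cases}_i &\sim \mathrm{Poisson}(\mu_i) \\
\log \mu_i &= \beta_0 + \beta_1\,\text{poverty}_i
              + \beta_2 \log(\text{distance}_i)
              + \log(\text{population}_i) \\
\Longrightarrow\quad
\log\!\left(\frac{\mu_i}{\text{population}_i}\right)
  &= \beta_0 + \beta_1\,\text{poverty}_i
     + \beta_2 \log(\text{distance}_i)
\end{aligned}
```

If this is right, the offset enters with its coefficient fixed at 1, so the linear predictor describes the log of the expected number of cases per person rather than the log of the raw count.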
summary(m1)
Family: poisson
Links: mu = log
Formula: cases ~ poverty + log(distance) + offset(log(population))
Data: df (Number of observations: 200)
Samples: 3 chains, each with iter = 4000; warmup = 1000; thin = 1;
total post-warmup samples = 9000
Population-Level Effects:
            Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept      -7.40      0.35    -8.08    -6.71       6591 1.00
poverty         0.01      0.00     0.00     0.01       9000 1.00
logdistance    -0.04      0.04    -0.12     0.03       5936 1.00
Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample
is a crude measure of effective sample size, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
I can plot the marginal effects of the poverty and log(distance) variables by running:
marginal_effects(m1)
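As far as I understand, the marginal effects plots are estimated at the mean values of the other covariates in the model.
However, what I am really interested in is plotting the number of cases as a function of population size, at mean distance and poverty levels.
Even better would be the rate of infections per population as a function of distance and poverty.
I am not sure whether what I am looking for is a) sensible, or b) possible in brms ... but any advice would be much appreciated.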