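I am working with survey data that uses probability weights and multiple imputation. After estimating a logit model with the imputed datasets and survey weights, I would like to obtain marginal effects. I can't figure out how to do this in R. Stata has the package mimrgns, which makes it easy. There is also an article (pdf) and supplementary material (pdf) that give some direction, but I can't seem to apply them to my situation.
In the example below, please assume that I have imputed "income" in three datasets (i.e., df1, df2, and df3). I want to predict "gender" from employment status (i.e., working) and "income".
Here is a reproducible example.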
library(tibble)
library(survey)
library(mitools)
library(ggeffects)
# Data set 1
# Note that I am excluding the "income" variable from the "df"s and creating
# it separately so that it varies between the data sets. This simulates the
# variation with multiple imputation. Since I am using the same seed
# (i.e., 123), all the other variables will be the same, the only one that
# will vary will be "income."
set.seed(123)
df1 <- tibble(id = seq(1, 100, by = 1),
              gender = as.factor(rbinom(n = 100, size = 1, prob = 0.50)),
              working = as.factor(rbinom(n = 100, size = 1, prob = 0.40)),
              pweight = sample(50:500, 100, replace = TRUE))
# Create random income variable.
set.seed(456)
income <- tibble(income = sample(0:100000, 100))
# Bind it to df1
df1 <- cbind(df1, income)
# Data set 2
set.seed(123)
df2 <- tibble(id = seq(1, 100, by = 1),
              gender = as.factor(rbinom(n = 100, size = 1, prob = 0.50)),
              working = as.factor(rbinom(n = 100, size = 1, prob = 0.40)),
              pweight = sample(50:500, 100, replace = TRUE))
set.seed(789)
income <- tibble(income = sample(0:100000, 100))
df2 <- cbind(df2, income)
# Data set 3
set.seed(123)
df3 <- tibble(id = seq(1, 100, by = 1),
              gender = as.factor(rbinom(n = 100, size = 1, prob = 0.50)),
              working = as.factor(rbinom(n = 100, size = 1, prob = 0.40)),
              pweight = sample(50:500, 100, replace = TRUE))
set.seed(101)
income <- tibble(income = sample(0:100000, 100))
df3 <- cbind(df3, income)
# Apply weights via svydesign
imputation <- svydesign(id = ~id,
                        weights = ~pweight,
                        data = imputationList(list(df1, df2, df3)))
# Logit model with weights and imputations
logitImp <- with(imputation,
                 svyglm(gender ~ working + income,
                        family = binomial()))
# Combine results across MI datasets
summary(MIcombine(logitImp))
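For reference, my understanding is that MIcombine pools the per-imputation estimates with Rubin's rules, so I assume any marginal effect would have to be pooled the same way. The notation below is my own and not taken from the package documentation: m is the number of imputed datasets (3 here), \hat{Q}_i is an estimate from imputation i (a coefficient, or hypothetically a marginal effect), and U_i is its estimated variance.

```latex
\begin{align*}
\bar{Q} &= \frac{1}{m} \sum_{i=1}^{m} \hat{Q}_i
  && \text{(pooled estimate)} \\
\bar{U} &= \frac{1}{m} \sum_{i=1}^{m} U_i
  && \text{(within-imputation variance)} \\
B &= \frac{1}{m-1} \sum_{i=1}^{m} \left( \hat{Q}_i - \bar{Q} \right)^2
  && \text{(between-imputation variance)} \\
T &= \bar{U} + \left( 1 + \frac{1}{m} \right) B
  && \text{(total variance)}
\end{align*}
```

The pooled standard error would then be \sqrt{T}. What I am missing is a way to get \hat{Q}_i and U_i for the marginal effects from each imputed survey design.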
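Normally I would use library(ggeffects) to get marginal effects, but when I try it with the imputed data I get the following error:
Error in class(model) <- "lmerMod" : attempt to set an attribute on NULL
Here is an example of how it works without imputation, using "df1" as the dataset.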
# Create new svydesign variable
noImp <- svydesign(id = ~id,
                   weights = ~pweight,
                   data = df1)
# Run model
# The data come from the design object, so no separate data argument is needed
logit <- svyglm(gender ~ working + income,
                family = binomial,
                design = noImp)
# Get marginal effects at the mean
ggpredict(logit, terms = "working")
Any ideas on how to do this with multiple imputation?