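I am running a probit regression in R. The model mixes continuous and categorical variables (coded as factors), and I want to compute the marginal effect of each variable. To do this I use the margins function from the margins package. It returns the average marginal effects (AME), recognizes factors, and reports a marginal effect for each of their levels. So how are categorical variables handled when the marginal effects are computed? If the continuous variables are held at their means (by default), at what values are the categorical variables fixed?
I hope the question is clear. It is a theoretical question.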
Answer 0 (score: 0)
Here is the reproducible example that Olivier suggested.
library(stats)
library(margins)
# Generate Data
set.seed(1234)
# Continuous
y<-rnorm(1000)
x1<-4*runif(1000)
x2<-2*rnorm(1000)
# Categorical
c1<-as.factor(ifelse(x1<=0.9,"A",
ifelse(x1>0.9 & x1<=2.4,"B",
ifelse(x1>2.4 & x1<=3.5,"C","D"))))
c2<-as.factor(ifelse(x2>2,"Y","N"))
table(c1)
> c1
> A B C D
> 201 397 268 134
table(c2)
> c2
> N Y
> 825 175
# Dummy dependent variable
y<-ifelse(y>0,1,0)
table(y)
> y
> 0 1
> 517 483
probit<-glm(y ~ x1 + x2 + c1 + c2,family=binomial(link="probit"))
# AME
ame<-summary(margins(probit)) # stored as "ame" so the margins() function name is not shadowed
ame[,c("factor","AME")]
> factor AME
> c1B 0.0068
> c1C 0.0620
> c1D 0.0800
> c2Y -0.0176
> x1 -0.0371
> x2 0.0037
# Average of the standard normal density evaluated at the fitted linear predictor
pdf<-mean(dnorm(predict(probit)))
# Manually compute the AME of x1 (continuous variable)
round(pdf*coef(probit)[2],4) # This is the same AME returned by the margins command !
> x1
> -0.0371
# 1. Compute Manually AME of C1 and C2 for each level (Categorical Variable) using pdf
round(pdf*coef(probit)[4],4) # AME_C1 Level B
> c1B
> 0.0069
round(pdf*coef(probit)[5],4) # AME_C1 Level C
> c1C
> 0.0623
round(pdf*coef(probit)[6],4) # AME_C1 Level D
> c1D
> 0.0804
# All of these differ slightly from the values returned by the margins command
round(pdf*coef(probit)[7],4) # AME_C2 (dummy)
> c2Y
> -0.0176
So my question is: does the margins command compute the marginal effects of dummy and categorical variables using discrete changes?
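One way to check this directly is to compute the discrete change by hand and compare it with the margins output. The sketch below is only an illustration under stated assumptions: it rebuilds the example data with the same seed, puts it in a data frame (here called dat, so that predict() can take modified copies), and uses a hypothetical helper, discrete_ame, that is not part of any package. For each observation it predicts the probability with the factor set to the baseline level and again with the factor set to the level of interest, then averages the difference. As I understand the margins documentation, this first difference from the baseline level is what the package reports for factor variables, which would explain the small gaps from the density-times-coefficient values.

```r
# Rebuild the example data (same seed and same order of random draws as above)
set.seed(1234)
y  <- rnorm(1000)
x1 <- 4 * runif(1000)
x2 <- 2 * rnorm(1000)

dat <- data.frame(
  y  = ifelse(y > 0, 1, 0),
  x1 = x1,
  x2 = x2,
  c1 = as.factor(ifelse(x1 <= 0.9, "A",
                 ifelse(x1 <= 2.4, "B",
                 ifelse(x1 <= 3.5, "C", "D")))),
  c2 = as.factor(ifelse(x2 > 2, "Y", "N"))
)

probit <- glm(y ~ x1 + x2 + c1 + c2,
              family = binomial(link = "probit"), data = dat)

# Hypothetical helper (not from the margins package): average discrete change
# in the predicted probability when `var` moves from `base` to `level`,
# with all other covariates left at their observed values.
discrete_ame <- function(model, data, var, level,
                         base = levels(data[[var]])[1]) {
  lv <- levels(data[[var]])
  d0 <- data
  d1 <- data
  d0[[var]] <- factor(rep(base,  nrow(data)), levels = lv)
  d1[[var]] <- factor(rep(level, nrow(data)), levels = lv)
  mean(predict(model, newdata = d1, type = "response") -
       predict(model, newdata = d0, type = "response"))
}

# Discrete-change AME for each non-baseline level of c1, and for the dummy c2
ame_c1 <- sapply(c("B", "C", "D"),
                 function(l) discrete_ame(probit, dat, "c1", l))
ame_c2 <- discrete_ame(probit, dat, "c2", "Y")

round(c(ame_c1, c2Y = ame_c2), 4)
```

If the assumption about margins holds, these values should line up with the c1B, c1C, c1D and c2Y rows of summary(margins(probit)) rather than with the pdf*coef(probit) values.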
In theory, if the variable Xj is continuous, the marginal probability effect is: [https://i.stack.imgur.com/OtEEx.jpg]
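For reference, the usual textbook expression for a probit model is written out here. That the linked image shows exactly this form is an assumption, since only the link is available, and its notation may differ.

```latex
\frac{\partial P(y = 1 \mid x)}{\partial x_j} = \phi(x'\beta)\,\beta_j
```

Here \phi is the standard normal density and x'\beta is the linear predictor. Averaging \phi(x_i'\beta) over the sample and multiplying by \beta_j gives the AME of a continuous variable, which is what mean(dnorm(predict(probit))) * coef(probit)[2] computes for x1 in the example.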
However, if Xj is a dummy variable: