Question

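I am running a probit regression in R. The model mixes continuous and categorical variables (coded as factors). I want to compute the marginal effect of each variable. To do so, I use the margins command from the margins package. This command returns the AMEs (average marginal effects), recognizes the factors, and reports a marginal effect for each of their levels. So how are categorical variables handled when the marginal effects are computed? If continuous variables are held at their means (by default), how are the categorical variables held fixed?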

I hope the question is clear; it is a theoretical question.

Answer 1

Here is a reproducible example, as suggested by Olivier.

library(stats)
library(margins)

# Generate Data
set.seed(1234)
# Continuous
y<-rnorm(1000)
x1<-4*runif(1000)
x2<-2*rnorm(1000)
# Categorical
c1<-as.factor(ifelse(x1<=0.9,"A",
                 ifelse(x1>0.9 & x1<=2.4,"B",
                        ifelse(x1>2.4 & x1<=3.5,"C","D"))))
c2<-as.factor(ifelse(x2>2,"Y","N"))
table(c1)
> c1
> A   B   C   D 
> 201 397 268 134

table(c2)
> c2
> N   Y 
> 825 175

# Dummy dependent variable
y<-ifelse(y>0,1,0)
table(y)
> y
> 0   1 
> 517 483

probit<-glm(y ~ x1 + x2 + c1 + c2,family=binomial(link="probit"))

# AME (stored as `ame` so the result does not shadow the margins() function)
ame<-summary(margins(probit))
ame[,c(1,2)]
> factor     AME
>    c1B  0.0068
>    c1C  0.0620
>    c1D  0.0800
>    c2Y -0.0176
>     x1 -0.0371
>     x2  0.0037

# Average standard normal density evaluated at the linear predictor
pdf<-mean(dnorm(predict(probit)))

# Manually compute the AME of x1 (continuous variable)
round(pdf*coef(probit)[2],4) # This matches the AME returned by the margins command!
 > x1 
 > -0.0371

# Manually compute the AME of c1 and c2 for each level (categorical variables) using pdf
round(pdf*coef(probit)[4],4) # AME_C1 Level B
> c1B 
> 0.0069 

round(pdf*coef(probit)[5],4) # AME_C1 Level C
> c1C 
> 0.0623 

round(pdf*coef(probit)[6],4) # AME_C1 Level D
> c1D 
> 0.0804 

# They are all slightly different from those returned by the margins command

round(pdf*coef(probit)[7],4) # AME_C2 (dummy)
> c2Y 
> -0.0176

So my question is: "Does the margins command compute marginal effects for dummy and categorical variables using discrete changes?"
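One way to check this by hand is to compute the average discrete change directly: set the factor to a given level for every observation, predict, and subtract the prediction obtained with the factor set to its baseline level. The sketch below uses only base R. The data frame dat, the helper avg_discrete_change, and the use of cut() in place of the nested ifelse() calls are illustrative choices, not part of the original post. As far as I understand the margins documentation, factor variables are handled with this kind of first difference, so these values would be expected to line up with the AME column above rather than with the pdf-times-coefficient figures.

```r
# Rebuild the example data (same seed and draw order as in the post)
set.seed(1234)
n  <- 1000
y  <- rnorm(n)
x1 <- 4 * runif(n)
x2 <- 2 * rnorm(n)
# cut() with right-closed intervals reproduces the nested ifelse() thresholds
c1 <- cut(x1, breaks = c(-Inf, 0.9, 2.4, 3.5, Inf), labels = c("A", "B", "C", "D"))
c2 <- factor(ifelse(x2 > 2, "Y", "N"))
dat <- data.frame(y = as.integer(y > 0), x1, x2, c1, c2)

probit <- glm(y ~ x1 + x2 + c1 + c2, data = dat,
              family = binomial(link = "probit"))

# Hypothetical helper: average discrete change in the predicted probability
# when `var` is moved from `base` to `level` for every observation.
avg_discrete_change <- function(model, data, var, level,
                                base = levels(data[[var]])[1]) {
  lv   <- levels(data[[var]])
  d_hi <- data
  d_lo <- data
  d_hi[[var]] <- factor(rep(level, nrow(data)), levels = lv)
  d_lo[[var]] <- factor(rep(base,  nrow(data)), levels = lv)
  mean(predict(model, newdata = d_hi, type = "response") -
       predict(model, newdata = d_lo, type = "response"))
}

ame_c1 <- sapply(c("B", "C", "D"),
                 function(l) avg_discrete_change(probit, dat, "c1", l))
ame_c2 <- avg_discrete_change(probit, dat, "c2", "Y")

# Density-times-coefficient values, as computed in the post, for comparison
pdf_bar   <- mean(dnorm(predict(probit)))  # predict() returns the link scale by default
approx_c1 <- pdf_bar * coef(probit)[c("c1B", "c1C", "c1D")]

print(round(ame_c1, 4))
print(round(approx_c1, 4))
print(round(ame_c2, 4))
```

For a continuous variable such as x1 there is no baseline level to difference against, which is why the density-times-coefficient calculation is the natural one there.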

In theory, if the variable Xj is continuous, the marginal probability effect is: [https://i.stack.imgur.com/OtEEx.jpg]
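The linked image is not reproduced here. Assuming it shows the standard probit expression, the marginal effect of a continuous regressor and its sample average (the AME) can be written as follows, where \phi is the standard normal density and x_i'\beta is the linear predictor. The second expression is what mean(dnorm(predict(probit))) times the coefficient computes in the code above.

```latex
\frac{\partial P(y_i = 1 \mid x_i)}{\partial x_{ij}} = \phi(x_i'\beta)\,\beta_j,
\qquad
\mathrm{AME}_j = \frac{1}{n}\sum_{i=1}^{n} \phi(x_i'\beta)\,\beta_j
```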

But if Xj is a dummy variable:

[https://i.stack.imgur.com/XMwrx.jpg]
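This image is also not reproduced here. Assuming it shows the usual discrete-change definition, the effect of a dummy is the difference in predicted probabilities with the dummy switched on and off, holding the other regressors at their observed values. In the notation below (an assumption, not taken from the image), x_{i,-j} and \beta_{-j} denote the regressors and coefficients other than the j-th, and \Phi is the standard normal CDF.

```latex
\Delta_j P(y_i = 1 \mid x_i)
  = \Phi\!\left(x_{i,-j}'\beta_{-j} + \beta_j\right)
  - \Phi\!\left(x_{i,-j}'\beta_{-j}\right),
\qquad
\mathrm{AME}_j = \frac{1}{n}\sum_{i=1}^{n} \Delta_j P(y_i = 1 \mid x_i)
```

For a factor with several levels, the same difference would be taken between each level and the baseline level, which gives one effect per non-baseline level.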

Probit regression: marginal effects of categorical variables?
