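I want to control for a factor variable with more than a hundred levels without printing the coefficients for that control in the summary table. Note that I am also interested in replicating the speed of the Stata command, not just in a cosmetic change to the output.
In Stata I can use "absorb" for this. For comparison, here is the regression with the year dummies included explicitly: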
. use http://www.stata-press.com/data/r14/abdata.dta, clear

. xtreg n w k i.year, fe

Fixed-effects (within) regression               Number of obs     =      1,031
Group variable: id                              Number of groups  =        140

R-sq:                                           Obs per group:
     within  = 0.6277                                         min =          7
     between = 0.8473                                         avg =        7.4
     overall = 0.8346                                         max =          9

                                                F(10,881)         =     148.56
corr(u_i, Xb)  = 0.5666                         Prob > F          =     0.0000

------------------------------------------------------------------------------
           n |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           w |  -.2731482   .0551503    -4.95   0.000    -.3813896   -.1649068
           k |   .5648036   .0212211    26.62   0.000     .5231537    .6064535
             |
        year |
       1977  |  -.0347963   .0188134    -1.85   0.065    -.0717206    .0021281
       1978  |  -.0583286   .0190916    -3.06   0.002    -.0957989   -.0208583
       1979  |   -.070047   .0190414    -3.68   0.000    -.1074187   -.0326752
       1980  |  -.0889378   .0189788    -4.69   0.000    -.1261867   -.0516889
       1981  |  -.1401502   .0186309    -7.52   0.000    -.1767163   -.1035841
       1982  |  -.1603768   .0188132    -8.52   0.000    -.1973008   -.1234528
       1983  |  -.1621103   .0222902    -7.27   0.000    -.2058585   -.1183621
       1984  |  -.1258136   .0282391    -4.46   0.000    -.1812373   -.0703899
             |
       _cons |   2.255419   .1772614    12.72   0.000     1.907515    2.603323
-------------+----------------------------------------------------------------
     sigma_u |  .64723143
     sigma_e |  .12836859
         rho |  .96215208   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(139, 881) = 126.32                  Prob > F = 0.0000
Using absorb removes the fixed effects from the output:
. reghdfe n w k, absorb(id year)
(converged in 7 iterations)

HDFE Linear regression                            Number of obs   =      1,031
Absorbing 2 HDFE groups                           F(   2,    881) =     362.67
                                                  Prob > F        =     0.0000
                                                  R-squared       =     0.9922
                                                  Adj R-squared   =     0.9908
                                                  Within R-sq.    =     0.4516
                                                  Root MSE        =     0.1284

------------------------------------------------------------------------------
           n |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           w |  -.2731482   .0551503    -4.95   0.000    -.3813896   -.1649068
           k |   .5648036   .0212211    26.62   0.000     .5231537    .6064535
-------------+----------------------------------------------------------------
    Absorbed |       F(147, 881) =    120.660   0.000             (Joint test)
------------------------------------------------------------------------------

Absorbed degrees of freedom:
---------------------------------------------------------------+
 Absorbed FE |  Num. Coefs.  =   Categories  -   Redundant     |
-------------+-------------------------------------------------|
          id |          140             140              0     |
        year |            8               9              1     |
---------------------------------------------------------------+
Answer 0: (score: 1)
I'm not aware of a built-in way to do this, but broom::tidy plus some filtering based on the factor's name will do what you want:
Example data:
set.seed(101)
dd <- data.frame(y = rnorm(1000),
                 f = factor(sample(1:50, size = 1000, replace = TRUE)),
                 x = rnorm(1000))
m <- lm(y ~ f + x, data = dd)
One way (this relies on grepl(), which is base R and which I'm more familiar with, so it mixes and matches paradigms a little bit):
library(broom)
library(dplyr)
tidy(m) %>%
filter(!grepl("^f[0-9]+",term))
## term estimate std.error statistic p.value
## 1 (Intercept) -0.22643955 0.18852186 -1.201131 0.2299999
## 2 x -0.03330846 0.03101449 -1.073964 0.2831116
Or you could use stringr::str_detect:
library(stringr)
tidy(m) %>%
filter(!str_detect(term,"^f[0-9]+"))
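The particular regular expression I used is based on the name of the factor plus the names of its levels. In your case, if you're lucky, it will be "^year[0-9]+", or just "^year".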
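A pattern on term names can also catch unrelated variables that happen to share the prefix (say, a hypothetical years_edu column). As a rough alternative sketch in base R only, the "assign" attribute of the model matrix records which term each coefficient column belongs to, so the factor's rows can be dropped by term instead of by pattern. Note that drop_term is a made-up helper name for illustration, not a function from broom or base R; the block recreates the simulated data from above so it runs on its own.

```r
# Simulated data, same as in the example above
set.seed(101)
dd <- data.frame(y = rnorm(1000),
                 f = factor(sample(1:50, size = 1000, replace = TRUE)),
                 x = rnorm(1000))
m <- lm(y ~ f + x, data = dd)

# Hypothetical helper: drop every coefficient row that belongs to `term`
drop_term <- function(model, term) {
  mm     <- model.matrix(model)
  labels <- attr(terms(model), "term.labels")  # e.g. c("f", "x")
  assign <- attr(mm, "assign")                 # 0 = intercept, k = k-th term
  drop   <- colnames(mm)[assign == match(term, labels)]
  tab    <- coef(summary(model))               # coefficient matrix
  tab[!rownames(tab) %in% drop, , drop = FALSE]
}

res <- drop_term(m, "f")
print(res)
```

Like the filtering approaches above, this only changes what is printed: lm() still estimates every dummy, so it does nothing for speed.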
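Answer 1: (score: 1)
The best alternative I could find is the lfe package, which implements models with high-dimensional fixed effects and/or instrumental variables.
You can specify the fixed effects after the vertical bar, like this: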
felm(n ~ w + k | year, data = df)
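The year coefficients will be absorbed in the final specification. The problem with this approach is that it does not let you predict observations.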
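For intuition about what "absorbing" a factor means: with a single fixed-effect factor, regressing group-demeaned y on group-demeaned x gives the same slope as the full dummy-variable regression (the within transformation). Below is a minimal base R sketch on simulated data. The variable names are made up for illustration, and this is only the one-factor special case, not a description of how lfe or reghdfe work internally. With two or more factors, as in absorb(id year), a single demeaning pass is in general not enough on unbalanced data, which is why the reghdfe output above reports iterations.

```r
# Simulated data: outcome y, regressor x, one factor f with 50 levels
set.seed(101)
dd <- data.frame(y = rnorm(1000),
                 f = factor(sample(1:50, size = 1000, replace = TRUE)),
                 x = rnorm(1000))

# Full dummy-variable regression: one coefficient per level of f
b_dummy <- coef(lm(y ~ f + x, data = dd))["x"]

# Within transformation: subtract group means, then regress without dummies
dd$y_w <- dd$y - ave(dd$y, dd$f)
dd$x_w <- dd$x - ave(dd$x, dd$f)
b_within <- coef(lm(y_w ~ x_w - 1, data = dd))["x_w"]

# The two slope estimates agree up to numerical tolerance
same <- all.equal(unname(b_dummy), unname(b_within))
print(same)
```

The point estimate matches, but the standard errors reported by the naive demeaned lm() do not account for the degrees of freedom used up by the absorbed levels, so they should not be used as is.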