Question

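proc glm makes it easy to add fixed effects without creating a dummy variable for every possible value of a class variable.

proc reg can compute robust (White) standard errors, but it requires you to create the individual dummy variables yourself.

Is there a way to combine these capabilities? I would like to be able to add many class variables and get White standard errors in the output. For example:

With proc glm, I can run the regression below. It gives correct estimates no matter how many levels the class variables contain, but it does not compute robust standard errors.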

proc glm data=ds1;
  class class1 class2 class3;
  weight n;
  model y = c class1 class2 class3 / solution;
run;

With proc reg, I can do this:

proc reg data=ds2;
  weight n;
  model y = x / white;
run;

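This includes White standard errors but not the fixed effects. For those I might need 50 or more dummy variables and a model statement like model y = x class1_d1 class1_d2 ... class3_dn /white;. If I start adding interaction terms, the number of dummy variables gets out of hand.

Obviously I could write a macro to create the dummy variables, but this seems like such basic functionality that I can't help thinking I'm missing something obvious (STATA and R both have ways to do this easily). Why can't I use a class statement in proc reg, or get robust standard errors out of proc glm?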

Answer 1

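I think I've found part of the answer, although I'd be interested in other solutions or refinements.

proc glmmod can be used to create the data set for proc reg: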

proc glmmod noprint outdesign=ds2 data=ds1;
  class class1 class2 class3;
  weight n;
  model y = c class1 class2 class3;
run;

proc reg data=ds2;
  weight n;
  model y = col2-col50 / white;
run;

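proc glmmod uses GLM syntax and outputs a regression data set containing all the dummy variables that proc reg needs.

It's not as clean as a single-proc solution (you have to keep track of the labels to see what each ColXX refers to), but it seems to work perfectly.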
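One way to make the ColXX bookkeeping less painful is to have proc glmmod also write its parameter map. The sketch below assumes the same ds1 and variables as above; the data set name parm is a hypothetical name chosen for illustration. The OUTPARM= data set should list each column number next to the effect name and the class levels it represents.

```sas
/* Same call as above, plus OUTPARM= to record what each design column means. */
/* "parm" is an illustrative data set name, not something from the question.  */
proc glmmod noprint outdesign=ds2 outparm=parm data=ds1;
  class class1 class2 class3;
  weight n;
  model y = c class1 class2 class3;
run;

/* One row per design column: _COLNUM_, EFFNAME, and the class levels. */
proc print data=parm;
run;
```

Note that the design matrix has an intercept column plus one column per class level, so it is typically less than full rank. proc reg should flag this and zero out the redundant parameters, which is worth confirming in the log before reading the White standard errors.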

Answer 2

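I think you could: (1) delete observations with missing variables, (2) demean the independent variables using proc standard, and (3) regress the dependent variable on the demeaned independent variables.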

http://pages.stern.nyu.edu/~adesouza/sasfinphd/index/node60.html http://pages.stern.nyu.edu/~adesouza/sasfinphd/index/node61.html

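The coefficients from the above procedure are identical to those from proc glm (Frisch-Waugh theorem). However, you don't have to create dummies (which was your main problem). To get robust standard errors, use proc reg in step (3) with the White standard errors option.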

Hope this helps.
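The three steps might look roughly like the sketch below for a single class variable. It assumes the question's ds1 with response y, weight n, and class variable class1, and a continuous regressor named x as in the proc reg example; the data set names complete and demeaned are hypothetical. It also demeans y along with x, which leaves the slope estimates unchanged.

```sas
/* Step 1: keep complete cases only (x is the assumed regressor name). */
data complete;
  set ds1;
  if not missing(y) and not missing(x)
     and not missing(n) and not missing(class1);
run;

/* proc standard with a BY statement needs the data sorted by the group. */
proc sort data=complete;
  by class1;
run;

/* Step 2: subtract the weighted group mean of y and x within each class1 level. */
proc standard data=complete mean=0 out=demeaned;
  by class1;
  var y x;
  weight n;
run;

/* Step 3: regress on the demeaned variables with White standard errors. */
proc reg data=demeaned;
  weight n;
  model y = x / white;
run;
```

Two caveats apply to this sketch. A single pass of group demeaning only absorbs one class variable; with several class variables and unbalanced data it is not generally equivalent to the full dummy-variable regression. Also, proc reg does not know that group means were removed, so any degrees-of-freedom-based quantities will not account for the absorbed effects.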

Answer 3

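I think I have an answer to this (or at least, if I don't, I'll probably find out by posting my solution here).

According to this page, robust standard errors can be computed with proc surveyreg by clustering the data so that each observation is its own cluster. Like this: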

data mydata;
set mydata;
counter=_n_;
run;

proc surveyreg data=mydata;
cluster counter;
model y=x;
run;

But proc surveyreg accepts a class statement, so one can run, for example:

proc surveyreg data=mydata;
class t;
cluster counter;
model y= t x*t / solution;
run;
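Applied to the model from the question, the same idea might be sketched as follows. This assumes the question's ds1 and variable names; ds1_id and counter are illustrative names. The weight statement is included on the assumption that n should be treated the same way as in the proc glm call.

```sas
/* Give every observation its own cluster id. */
data ds1_id;
  set ds1;
  counter = _n_;
run;

/* CLASS handles the fixed effects; one-observation clusters give robust SEs. */
proc surveyreg data=ds1_id;
  class class1 class2 class3;
  cluster counter;
  weight n;
  model y = c class1 class2 class3 / solution;
run;
```

The standard errors may differ slightly from those of proc reg with the white option, because proc surveyreg applies its own degrees-of-freedom adjustment to the variance estimate by default.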

Regression with robust (White) standard errors and CLASS variables for fixed effects
