Question

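I have a simple dataset to which I have fitted a simple linear regression model. Now I would like to use fixed effects to get better predictions from the model. I know I could also create dummy variables, but my real dataset contains more years and more variables, so I would like to avoid creating dummies.

My data and code look like this: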

data <- read.table(header = TRUE, 
                   stringsAsFactors = FALSE, 
                   text="CompanyNumber ResponseVariable Year ExplanatoryVariable1 ExplanatoryVariable2
                   1 2.5 2000 1 2
                   1 4 2001 3 1
                   1 3 2002 5 7
                   2 1 2000 3 2
                   2 2.4 2001 0 4
                   2 6 2002 2 9
                   3 10 2000 8 3")

library(lfe)
library(caret)
fe <- getfe(felm(data = data, ResponseVariable ~ ExplanatoryVariable1 + ExplanatoryVariable2 | Year))
fe
lm.1 <- lm(ResponseVariable ~ ExplanatoryVariable1 + ExplanatoryVariable2, data = data)

prediction <- predict(lm.1, data)
prediction

check_model <- postResample(pred = prediction, obs = data$ResponseVariable)
check_model

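For my real dataset I will predict on a test set, but for simplicity I just use the training set here.

I would like to make predictions using the fixed effects I found. But they do not seem to line up with the observations. Does anyone know how to use this fe$effect?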

prediction_fe<- predict(lm.1, data) + fe$effect

Answer 1

Here are some extra comments on your setup and the models you are running.

The main model you are fitting is

lm.1<-lm(ResponseVariable ~ ExplanatoryVariable1 + ExplanatoryVariable2, data=data)

which yields

> lm.1
Call:
lm(formula = ResponseVariable ~ ExplanatoryVariable1 + ExplanatoryVariable2, 
    data = data)

Coefficients:
         (Intercept)  ExplanatoryVariable1  ExplanatoryVariable2  
              0.8901                0.7857                0.1923

When you run the predict function on this model, you get

> predict(lm.1)
       1        2        3        4        5        6        7 
2.060385 3.439410 6.164590 3.631718 1.659333 4.192205 7.752359
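
As a rough sketch of what predict does for this model, the fitted values can be rebuilt by multiplying the design matrix by the coefficient vector. The names X and manual are illustrative and not part of the original post; the data is the toy dataset from the question.

```r
# Toy data from the question
data <- read.table(header = TRUE,
                   stringsAsFactors = FALSE,
                   text = "CompanyNumber ResponseVariable Year ExplanatoryVariable1 ExplanatoryVariable2
                   1 2.5 2000 1 2
                   1 4 2001 3 1
                   1 3 2002 5 7
                   2 1 2000 3 2
                   2 2.4 2001 0 4
                   2 6 2002 2 9
                   3 10 2000 8 3")

lm.1 <- lm(ResponseVariable ~ ExplanatoryVariable1 + ExplanatoryVariable2, data = data)

# Design matrix: an intercept column plus the two explanatory variables
X <- model.matrix(lm.1)

# Prediction by hand: each row of X times the coefficient vector
manual <- drop(X %*% coef(lm.1))

# Should agree with predict() up to floating point error
all.equal(unname(manual), unname(predict(lm.1)))
```
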

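This corresponds to computing (for observation 1) 0.8901 + 1 * 0.7857 + 2 * 0.1923, which is about 2.0604 with the rounded coefficients, so the estimated coefficients are what the prediction uses. The felm model is slightly more complicated because it "factors out" the Year component. The model fit is shown here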

> felm(data = data, ResponseVariable ~ ExplanatoryVariable1 + ExplanatoryVariable2 | Year)
ExplanatoryVariable1 ExplanatoryVariable2 
              0.9726               1.3262

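This corresponds to "correcting" or adjusting for Year, so you get the same estimates for the explanatory variables as when you fit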

> lm(data = data, ResponseVariable ~ ExplanatoryVariable1 + ExplanatoryVariable2 + factor(Year))

Call:
lm(formula = ResponseVariable ~ ExplanatoryVariable1 + ExplanatoryVariable2 + 
    factor(Year), data = data)

Coefficients:
         (Intercept)  ExplanatoryVariable1  ExplanatoryVariable2      factor(Year)2001  
             -2.4848                0.9726                1.3262                0.9105  
    factor(Year)2002  
             -7.0286

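and then discard every coefficient except those of the explanatory variables. Therefore you cannot take the fixed effects extracted from felm and simply add them to obtain predictions, because you are missing the intercept and all the year effects. You can only see the effect sizes.

Hope this helps.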
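
One possible way to still get predictions that include the year effects, sketched here under the assumption that the number of years is small enough for lm to handle, is to predict from the equivalent lm fit with factor(Year). factor() builds the dummy columns internally, so none have to be created by hand. The names lm.fe, year_effects and manual_fe are illustrative and not from the original post.

```r
# Toy data from the question
data <- read.table(header = TRUE,
                   stringsAsFactors = FALSE,
                   text = "CompanyNumber ResponseVariable Year ExplanatoryVariable1 ExplanatoryVariable2
                   1 2.5 2000 1 2
                   1 4 2001 3 1
                   1 3 2002 5 7
                   2 1 2000 3 2
                   2 2.4 2001 0 4
                   2 6 2002 2 9
                   3 10 2000 8 3")

# Same model as the felm fit, with Year entered as a factor
lm.fe <- lm(ResponseVariable ~ ExplanatoryVariable1 + ExplanatoryVariable2 + factor(Year),
            data = data)

# predict() now includes the intercept and the year effects
prediction_fe <- predict(lm.fe, newdata = data)

# Year-specific intercepts: baseline intercept (year 2000) plus each year's offset
b <- coef(lm.fe)
year_effects <- c("2000" = unname(b["(Intercept)"]),
                  "2001" = unname(b["(Intercept)"] + b["factor(Year)2001"]),
                  "2002" = unname(b["(Intercept)"] + b["factor(Year)2002"]))

# Rebuild the predictions by hand, matching every row to its own year's intercept
manual_fe <- year_effects[as.character(data$Year)] +
  b["ExplanatoryVariable1"] * data$ExplanatoryVariable1 +
  b["ExplanatoryVariable2"] * data$ExplanatoryVariable2

all.equal(unname(manual_fe), unname(prediction_fe))
```

The lookup by as.character(data$Year) is the important step: there is one effect per year, not one per observation, so a per-year vector has to be expanded to the rows before it is added to anything. A year that does not appear in the training data has no estimated effect, so predict would fail for such rows in a test set.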
