Question

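I was wondering whether it is possible to compute fitted values for a set of observations different from the subsample used to fit a linear regression. Specifically, I have a full data frame containing 400 individuals. I want to run two separate OLS regressions, subsetting the data frame according to the value of a dummy variable.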

ols1 <- lm(log_consumption ~ log_wage + Age + Age2 + Education, data = df, subset = type == 1)
ols2 <- lm(log_consumption ~ log_wage + Age + Age2 + Education, data = df, subset = type == 0)

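This code obviously returns two separate models and their corresponding fitted values. However, I would like to obtain fitted values for my whole data frame (i.e., for all 400 individuals), first according to model 1 and then according to model 2. Basically, I want to exploit the difference between the OLS coefficients obtained under the two different "regimes".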

Is there a way to do this in R?

Thanks for your help, Marco

Answer 1

It seems you want predict(). Try predict(ols1, df) and predict(ols2, df). Here is an example using the iris dataset.

## data  
df <- iris
df$type <- rep(c(0, 1), 75) # 75 type 0 and 75 type 1

## models
ols1 <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width,
           data = df, subset = type == 1)
ols2 <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width,
           data = df, subset = type == 0)

## predicted values for all the 150 observations
# just for checking: fitted(ols1) and fitted(ols2) give the 75 fitted values
length(fitted(ols1))
length(fitted(ols2))
# here, we want predicted values instead of fitted values
# with the predict() function, we can obtain predicted values for all 150 observations
predict(ols1, df)
predict(ols2, df)
# check: we have 150 observations
length(predict(ols1, df))
length(predict(ols2, df))
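
To compare the two "regimes" directly, one option is to store both sets of predictions as columns of the data frame and take their difference. The following is only a sketch built on the iris example above; the column names pred1, pred2 and pred_diff and the formula object f are illustrative choices, not something from the original question.

```r
## data and models, as in the example above
df <- iris
df$type <- rep(c(0, 1), 75)

f <- Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width
ols1 <- lm(f, data = df, subset = type == 1)
ols2 <- lm(f, data = df, subset = type == 0)

## predictions for all 150 rows under each model
## (pred1, pred2, pred_diff are hypothetical column names)
df$pred1 <- predict(ols1, newdata = df)
df$pred2 <- predict(ols2, newdata = df)
df$pred_diff <- df$pred1 - df$pred2

## sanity check: on its own subsample, predict() reproduces fitted()
all.equal(unname(fitted(ols1)), unname(df$pred1[df$type == 1]))

## difference between the two sets of coefficients
coef(ols1) - coef(ols2)

head(df[, c("type", "pred1", "pred2", "pred_diff")])
```

With the original data, the same pattern should apply by replacing the formula with log_consumption ~ log_wage + Age + Age2 + Education, provided df contains all of those columns for every individual.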
