Question

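I want to apply some basic calculations to the residuals of a plm model, but I am stuck on how to automate the steps for a large amount of data.

Suppose the input is a data.frame (df) containing the following data: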

Id         Year  Population  Y           X1           X2           X3
country A  2009  977612      212451.009  19482.7995   0.346657979  0.001023221
country A  2010  985332      221431.632  18989.3      0.345142551  0.001015205
country A  2011  998211      219939.296  18277.79286  0.344020453  0.001002106
country A  2012  1010001     218487.503  17916.2765   0.342434314  0.000990409
country B  2009  150291      177665.268  18444.04522  0.330864789  0.001940218
country B  2010  150841      183819.407  18042        0.327563461  0.001933143
country B  2011  152210      183761.566  17817.3515   0.32539255   0.001915756
country B  2012  153105      182825.112  17626.62261  0.321315437  0.001904557
country c  2009  83129       132328.034  17113.64268  0.359525557  0.005862866
country c  2010  83752       137413.878  16872.5      0.357854141  0.005819254
country c  2011  84493       136002.537  16576.17856  0.356479235  0.005768219
country c  2012  84958       133064.911  16443.3057   0.355246122  0.005736648

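Fit the model and store the residuals:

library(plm)

# Fixed-effects ("within") panel model, indexed by country (Id) and Year
fixed <- plm(Y ~ X1 + X2 + X3,
             data = df, drop.unused.levels = TRUE, index = c("Id", "Year"), model = "within")
residuals <- resid(fixed)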

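In the next step, I want to compute a "weighted average" of my residuals:

[Image: Residuals Formula]

Each residual is weighted by the population of country i at time t, and n denotes the total population at time t.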
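The formula itself was posted as an image. Judging from the description and from the code in the rest of the question, it presumably has the form sketched below. The symbols $e_{it}$ (residual), $p_{it}$ (population of country $i$ in year $t$) and $\bar{e}_t$ are notation assumed here for illustration, not taken from the original image.

```latex
\bar{e}_t = \sum_{i} \frac{p_{it}}{n_t}\, e_{it},
\qquad
n_t = \sum_{i} p_{it}
```

Under this reading, each "new" residual is the single term $\frac{p_{it}}{n_t} e_{it}$, and summing those terms over the countries within a year gives the weighted average for that year.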

My approach so far:

First, I compute the total population for each year:

year_range <- seq(from=2009,to=2012,by=1)
tot_pop = NULL
for (n in year_range)
{
  tot_pop[n] = with(df, sum(Population[Year == n]))
}

Before I can get the "weighted" residuals, my next step is to automate the calculation of my "new" residuals:

res1 <- df$Population[1]/tot_pop[2009] * residuals[1]
res2 <- df$Population[2]/tot_pop[2010] * residuals[2]
res3 <- df$Population[3]/tot_pop[2011] * residuals[3]
...
res12 <- df$Population[12]/tot_pop[2012] * residuals[12]

Edit: applying JTT's solution to my problem, the last step is:

year_range1 <- rep(year_range, 3)
df_res <- data.frame(year = year_range1, res=as.vector(res))
aggr_res <- aggregate(df_res$res, list(df_res$year), sum)
colnames(aggr_res) <- c("Year", "Aggregated residual")

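Is this correct?

I tried the lapply function and a double "for-loop" without success, and I do not know how to proceed. Any help is much appreciated. If my question is unclear, please comment and I will try to improve it.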

Answer 1

First, you may want to use the aggregate function to compute the total population instead of a for loop, e.g.:

a<-aggregate(df$Population, list(df$Year), sum)

Note the column names of a (Group.1 and x).
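To make the column names concrete, here is a minimal sketch that rebuilds only the Id, Year and Population columns from the question's table and runs the same aggregate call. The object name a2 and the formula-interface variant are additions for illustration, not part of the original answer.

```r
# Id, Year and Population copied from the table in the question
df <- data.frame(
  Id = rep(c("country A", "country B", "country c"), each = 4),
  Year = rep(2009:2012, times = 3),
  Population = c(977612, 985332, 998211, 1010001,
                 150291, 150841, 152210, 153105,
                 83129, 83752, 84493, 84958)
)

# Same call as in the answer: one row per year, total population in column x
a <- aggregate(df$Population, list(df$Year), sum)
names(a)  # "Group.1" "x"
print(a)

# Alternative (illustrative): the formula interface keeps the original names
a2 <- aggregate(Population ~ Year, data = df, FUN = sum)
names(a2)  # "Year" "Population"
```

With the formula interface, the later lookup would use a2$Population and a2$Year in place of a$x and a$Group.1.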

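Then you can use the match() function to match the totals in a to the rows of df. It returns the matching row numbers, which can be used to pull the right yearly total from a for the division before multiplying by the residuals. For example: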

res<-df$Population/a$x[match(df$Year, a$Group.1)]*residuals

Now you should have a vector of the "new" residuals in the object res.
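The whole pipeline can be sketched end to end as follows. The residual values here are placeholders invented purely so the example runs without fitting the model; in practice they would come from as.numeric(resid(fixed)), and the sketch assumes they are in the same row order as df. The names idx, w and aggr_res are illustrative.

```r
# Id, Year and Population copied from the table in the question
df <- data.frame(
  Id = rep(c("country A", "country B", "country c"), each = 4),
  Year = rep(2009:2012, times = 3),
  Population = c(977612, 985332, 998211, 1010001,
                 150291, 150841, 152210, 153105,
                 83129, 83752, 84493, 84958)
)

# PLACEHOLDER residuals (not from the plm model), one per row of df
residuals <- (1:12 - 6.5) / 10

# Total population per year, then the row of `a` that matches each row of df
a <- aggregate(df$Population, list(df$Year), sum)
idx <- match(df$Year, a$Group.1)

# Population share of each country within its year, and the "new" residuals
w <- df$Population / a$x[idx]
res <- w * residuals

# Sanity check: the weights within each year should sum to 1
print(tapply(w, df$Year, sum))

# Sum the weighted residuals per year, grouping by the Year column itself
aggr_res <- aggregate(res, list(Year = df$Year), sum)
print(aggr_res)
```

Grouping by df$Year directly avoids building a separate year vector with rep(year_range, 3), which only lines up with the residuals if the rows happen to be sorted by country and then by year. An equivalent one-step way to get the weights in base R is df$Population / ave(df$Population, df$Year, FUN = sum).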

Automating basic calculations on residuals in R
