I am trying to run some regressions on different datasets.
Load libraries + data
libraries = c('AER','plm','dplyr','lmtest','foreach','doParallel')
lapply(libraries, require, character.only = TRUE)
no_cores <- max(1, detectCores() - 8) # keep at least one worker on machines with 8 or fewer cores
cl <- makeCluster(no_cores, type="PSOCK")
registerDoParallel(cl)
data("Fatalities")
df1 = Fatalities
df2 = Fatalities %>% filter(spirits>1.3)
Standalone regression
First, I run the regression outside of a loop to test the dynamic variable references, and it works as expected.
yvar= as.symbol('fatal')
xvars = rlang::parse_expr("beertax + baptist + income + emppop")
data_to_use = as.symbol("df1")
fatal_fe_mod <- eval(bquote(plm( .(yvar) ~ .(xvars), data = .(data_to_use), index = c("state", "year"), model = "within", effect= "time")))
summary(fatal_fe_mod)
mod_summary = eval(bquote(coeftest(fatal_fe_mod, vcov=vcovHC(fatal_fe_mod,type="HC0",cluster="group")))) # clustered errors
mod_summary
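To make the bquote() construction easier to inspect, here is a minimal base-R sketch. It is not the code above: it assumes a small made-up data frame (toy_df) and uses lm() and str2lang() in place of plm() and rlang::parse_expr(), so that it runs without extra packages. The point it illustrates is that the resulting call carries the name of the data frame, not the data itself.

```r
# Toy data standing in for Fatalities (made-up values, for illustration only)
set.seed(1)
toy_df <- data.frame(beertax = runif(20), income = runif(20))
toy_df$fatal <- 2 * toy_df$beertax - toy_df$income + rnorm(20)

yvar <- as.symbol("fatal")
xvars <- str2lang("beertax + income")  # base-R counterpart of rlang::parse_expr()
data_to_use <- as.symbol("toy_df")

# bquote() substitutes the .() pieces and returns an unevaluated call
model_call <- bquote(lm(.(yvar) ~ .(xvars), data = .(data_to_use)))
print(model_call)

# The data argument is only a symbol; it is looked up by name when eval() runs
print(is.symbol(model_call$data))

# Evaluating the call fits the model, provided toy_df can be found by name
fit <- eval(model_call)
print(coef(fit))
```

Because the lookup happens at eval() time, the call only works in an R session where an object with that name actually exists.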
Regular loop
Next, I run the regression inside a regular for loop, which also works.
datasets = c("df1", "df2")
for(num_spec in c(1:2)) {
print(num_spec)
yvar= as.symbol('fatal')
xvars = rlang::parse_expr("beertax + baptist + income + emppop")
data_to_use = as.symbol(datasets[num_spec])
fatal_fe_mod <- eval(bquote(plm( .(yvar) ~ .(xvars), data = .(data_to_use), index = c("state", "year"), model = "within", effect= "time")))
summary(fatal_fe_mod)
mod_summary = eval(bquote(coeftest(fatal_fe_mod, vcov=vcovHC(fatal_fe_mod,type="HC0",cluster="group")))) # clustered errors
print(mod_summary)
}
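Parallel loop
Unfortunately, the parallel loop fails. It returns an error immediately. The problem seems to be related to the .(data_to_use) step: inside the parallel loop, R does not appear to have access to the data frames.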
foreach(num_spec=1:2, .packages=libraries) %dopar% {
yvar= as.symbol('fatal')
xvars = rlang::parse_expr("beertax + baptist + income + emppop")
data_to_use = as.symbol(datasets[num_spec])
fatal_fe_mod <- eval(bquote(plm( .(yvar) ~ .(xvars), data = .(data_to_use), index = c("state", "year"), model = "within", effect= "time")))
summary(fatal_fe_mod)
mod_summary = eval(bquote(coeftest(fatal_fe_mod, vcov=vcovHC(fatal_fe_mod,type="HC0",cluster="group")))) # clustered errors
mod_summary
}
The last one immediately returns:
Error in { : task 1 failed - "object 'df1' not found"
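One possible explanation, offered as a sketch rather than a confirmed diagnosis: with a PSOCK cluster each worker is a fresh R session, and foreach only ships the variables it can see referenced in the loop body. Here df1 and df2 appear only as strings inside datasets, so they may never be sent to the workers. The example below assumes toy data frames and lm() in place of Fatalities and plm(), a fixed two-worker cluster, and str2lang() in place of rlang::parse_expr(). It uses foreach's .export argument to name the data frames explicitly.

```r
library(foreach)
library(doParallel)

# Toy data standing in for df1 and df2 (made-up values, for illustration only)
set.seed(1)
df1 <- data.frame(beertax = runif(40), income = runif(40))
df1$fatal <- 2 * df1$beertax - df1$income + rnorm(40)
df2 <- df1[df1$beertax > 0.3, ]
datasets <- c("df1", "df2")

# Two workers are assumed here purely to keep the sketch small
cl <- makeCluster(2, type = "PSOCK")
registerDoParallel(cl)

# .export lists the objects that are referenced only through strings,
# so foreach cannot detect them on its own
results <- foreach(num_spec = 1:2, .export = c("df1", "df2")) %dopar% {
  yvar <- as.symbol("fatal")
  xvars <- str2lang("beertax + income")
  data_to_use <- as.symbol(datasets[num_spec])
  fit <- eval(bquote(lm(.(yvar) ~ .(xvars), data = .(data_to_use))))
  coef(fit)
}

stopCluster(cl)
print(results)
```

If this reading is right, calling parallel::clusterExport(cl, c("df1", "df2")) before the loop should have a similar effect. Another option is to keep the data frames in a named list and index that list inside the loop body, so that the data are referenced directly rather than by name.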