R: foreach loop, parallel processing, calling macro variables

Time: 2017-12-08 06:23:53

Tags: r variables parallel-processing macros parallel.foreach

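I am trying to improve loop performance by replacing a 'for loop' with a 'foreach loop' that runs in parallel. I have about 1000 small data frames with different numbers of rows and columns.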

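My goal is to row-bind all of these data frames into a single matrix using bind_rows from the dplyr package. I have done some online research on the basics of setting up foreach and doParallel, for example: "run a for loop in parallel in R", "Parallel R Loops for Windows and Linux", and "R - parallel computing in 5 minutes (with foreach and doParallel)".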

Below are more details on my environment (data preparation).

Sample small data frames. Note: all of these small data frames may have different rows and columns.

RYW0001_rs <- data.frame(
  "A" = c("Coff", "Apple", "Coff", "Milk", "Milk", "Coff"),
  "B" = c("ToothB", "Apple", "Orange", NA, "Pear", "Grape"),
  "C" = c("ToothP", "ToothP", NA, NA, "ToothB", "Yam"),
  "D" = c(NA, "Potato", NA, NA, NA, NA)
)

RYW0002_rs <- data.frame(
  "A" = c("Coff", "Apple", "Coff", "Milk", "Milk", "Coff"), 
  "B" = c(NA, "Potato", NA, NA, NA, NA)
)

RYW0003_rs <- data.frame(
  "A" = c("Coff", "Apple", "Coff", "Milk", "Milk", "Coff"), 
  "B" = c("ToothB", "Apple", "Orange", NA, "Pear", "Grape"),
  "C" = c("Apple", "ToothP", "Orange", NA, "Milk", "Grape"),
  "D" = c("ToothP", "Orange", NA, NA, "Pear", "Yam"), 
  "E" = c("ToothP", "ToothP", NA, NA, "ToothB", "Yam"), 
  "F" = c(NA, "Potato", NA, NA, NA, NA)
)

Store the data frame names as character strings (used as macro variables)

Merchant_No_rs1 <- c('RYW0001_rs','RYW0002_rs','RYW0003_rs')

Code 1: the previous for loop [works fine; it produces the warning messages below, but they do not affect my expected result]

Warning messages:
1: In bind_rows_(x, .id) : Unequal factor levels: coercing to character
2: In bind_rows_(x, .id) : Unequal factor levels: coercing to character
3: In bind_rows_(x, .id) : Unequal factor levels: coercing to character

Step 1: create an EMPTY new temp object

temp <- NULL

Step 2: for loop

for (j in seq_along(Merchant_No_rs1)) {
  # look up each data frame by name and append its rows to temp
  temp <- bind_rows(temp, get(Merchant_No_rs1[[j]]))
  print(dim(temp))
}
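
The loop above grows temp one data frame at a time, so the accumulated rows are copied again on every iteration. A minimal sketch of an alternative, assuming all the data frames live in the environment where the code runs: collect them into a list with mget() and call bind_rows() once. The frames toy_a, toy_b, toy_c and the vector toy_names are hypothetical stand-ins for the real data, not part of the original post.

```r
library(dplyr)

# Hypothetical stand-ins for the real data frames (different shapes on purpose).
# stringsAsFactors = FALSE keeps strings as character; on R versions before 4.0
# this should avoid the "Unequal factor levels" warnings.
toy_a <- data.frame(A = c("Coff", "Milk"), B = c("Pear", NA), stringsAsFactors = FALSE)
toy_b <- data.frame(A = c("Apple", "Coff", "Milk"), stringsAsFactors = FALSE)
toy_c <- data.frame(A = "Coff", C = "Yam", stringsAsFactors = FALSE)
toy_names <- c("toy_a", "toy_b", "toy_c")

# mget() looks up every name at once and returns a named list of data frames.
frames <- mget(toy_names)

# bind_rows() accepts a list; columns missing from a frame are filled with NA.
combined <- bind_rows(frames)
print(dim(combined))
```

With the real data, toy_names would be replaced by Merchant_No_rs1. Whether this is fast enough without any parallelism depends on the data, but it removes the repeated copying.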

Code 2: the current foreach loop [does not work; it fails with the error below]

Error in { : task 1 failed - "object 'RYW0001_rs' not found"

Step 1: create an EMPTY new temp object

temp <- NULL

Step 2: foreach loop

foreach (j=1:length(Merchant_No_rs1), .packages=c("dplyr"), .export=sprintf("%s",Merchant_No_rs1[[j]])) %dopar% {  
temp <- bind_rows(temp, get(Merchant_No_rs1[[j]]))
} 
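
A possible explanation for the error, offered as a sketch and not a confirmed diagnosis: the .export argument is evaluated once, before any task runs, so Merchant_No_rs1[[j]] uses whatever j is left over in the calling session (probably 3, from the earlier for loop), and only that one data frame is sent to the workers. Separately, assigning to temp inside %dopar% changes a copy on the worker, not the temp in the main session; results have to come back through .combine. The sketch below iterates over the data frames themselves and row-binds the results with .combine = bind_rows. The toy frames, the name toy_names, and the cluster size of 2 are assumptions made for illustration.

```r
library(foreach)
library(doParallel)
library(dplyr)

# Hypothetical stand-ins for the real data frames.
toy_a <- data.frame(A = c("Coff", "Milk"), B = c("Pear", NA), stringsAsFactors = FALSE)
toy_b <- data.frame(A = c("Apple", "Coff", "Milk"), stringsAsFactors = FALSE)
toy_c <- data.frame(A = "Coff", C = "Yam", stringsAsFactors = FALSE)
toy_names <- c("toy_a", "toy_b", "toy_c")

# Resolve the names in the main session, so workers never need get().
frames <- mget(toy_names)

cl <- makeCluster(2)  # assumed worker count
registerDoParallel(cl)

# Each task receives one data frame; .combine row-binds the returned values
# in the main session, in the original order.
combined_par <- foreach(frame = frames, .combine = bind_rows,
                        .packages = "dplyr") %dopar% {
  frame  # any per-frame work would go here
}

stopCluster(cl)
print(dim(combined_par))
```

Because the combining still happens in the main session, this is not guaranteed to be faster than a single bind_rows() call on the list; parallelism mainly pays off when each task does real work on its data frame.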

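My expected result is the same as the result of Code 1: even though the small data frames have different rows and columns, any new column is appended to the columns of the temp table. Below is the result table (temp).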

Question: is there a way to use a 'foreach loop' for parallel processing and still get the same result as the 'for loop'?

Any help would be much appreciated :) Thanks

0 answers:

No answers yet