Running multiple tasks at once in R

Time: 2014-07-18 10:01:10

Tags: r

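rm(list = ls())
library(parallel)  # provides makeCluster, clusterExport, clusterApply, stopCluster

# Initial version of the data; mat is rebuilt further down before the cluster is created.
mat <- data.frame(matrix(NA, 32, 11))
mat[c(1, 11, 21), ] <- mtcars[c(1, 11, 21), ]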

tasks <- list(
  # Jobs 1-3: copy each row into the NA row below it until no NA is left,
  # then return the filled data frame.
  job1 = function() {
    repeat {
      na <- which(is.na(mat1[, 1]))
      mat1[na, ] <- mat1[na - 1, ]
      if (!any(is.na(mat1[, 1]))) break
    }
    mat1
  },
  job2 = function() {
    repeat {
      na <- which(is.na(mat2[, 1]))
      mat2[na, ] <- mat2[na - 1, ]
      if (!any(is.na(mat2[, 1]))) break
    }
    mat2
  },
  job3 = function() {
    repeat {
      na <- which(is.na(mat3[, 1]))
      mat3[na, ] <- mat3[na - 1, ]
      if (!any(is.na(mat3[, 1]))) break
    }
    mat3
  },
  # To check that the computations are indeed running in parallel.
  job4 = function() for (i in 1:5) { cat("4"); Sys.sleep(1) },
  job5 = function() for (i in 1:5) { cat("5"); Sys.sleep(1) },
  job6 = function() for (i in 1:5) { cat("6"); Sys.sleep(1) }
)

# Seed the first row of each block so the jobs have a value to carry forward.
mat <- data.frame(matrix(NA, 32, 11))
mat[c(1, 11, 16), ] <- mtcars[c(1, 11, 21), ]

mat1 <- mat[1:10, ]
mat2 <- mat[11:15, ]
mat3 <- mat[16:32, ]

cl <- makeCluster(length(tasks))
clusterExport(cl, c("mat1", "mat2", "mat3"))  # copy the three blocks to every worker
out <- clusterApply(cl, tasks, function(f) f())
stopCluster(cl)

1 answer:

Answer 0 (score: 0)

Try this:

 cl <- makeCluster(length(tasks))
 clusterExport(cl, "mat")
 out <- clusterApply(cl, tasks, function(f) f())

You need to make the objects available to the worker processes. This is what clusterExport does.
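To illustrate the point about clusterExport, here is a minimal sketch. The variable name demo_value is hypothetical and exists only for this illustration; the sketch assumes a default cluster created with makeCluster(2), whose workers start as fresh R sessions.

```r
library(parallel)

# Hypothetical object that lives only in the master session at first.
demo_value <- 42

cl <- makeCluster(2)

# Ask each worker whether it can see the object before exporting it.
before <- clusterEvalQ(cl, exists("demo_value"))

# Copy the object from the master session into every worker's global environment.
clusterExport(cl, "demo_value")

# Ask again after the export.
after <- clusterEvalQ(cl, exists("demo_value"))

stopCluster(cl)

print(unlist(before))
print(unlist(after))
```

Under those assumptions the workers should report that the object is missing before the export and present afterwards, which is why functions that refer to mat fail on the workers until it is exported.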

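Also, note that your functions are a bit odd. Every worker process ends up doing the same task, which is probably not what you intended. Note that simply writing mat[1:10] only prints the corresponding output and does nothing else. Try, for example, the following: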

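   job1 = function() {
     mat <- mat[1:10, ]                 # note the difference here
     repeat {
       na <- which(is.na(mat[, 1]))     # positions of the rows that are still missing
       mat[na, ] <- mtcars[na, ]
       if (!any(is.na(mat[, 1]))) break
     }
     mat                                # return the filled subset to the master
   }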

Adjust the other jobs in the same way. Then call

 cl <- makeCluster(length(tasks))
 clusterExport(cl, c("mat", "mtcars"))     # make sure mtcars is loaded
 out <- clusterApply(cl, tasks, function(f) f())
 stopCluster(cl)

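This is still slightly inefficient, because you export the whole matrix to every worker process and then use only a subset of it. But I think the principle is clear.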
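One possible way around that inefficiency, sketched here under assumptions rather than taken from the original answer, is to pass each block as an argument instead of exporting the whole object. The helper name fill_down and the list chunks are hypothetical; the fill rule is the carry-forward loop from the question.

```r
library(parallel)

# Rebuild the question's data: a 32 x 11 frame with three seeded rows.
mat <- data.frame(matrix(NA, 32, 11))
mat[c(1, 11, 16), ] <- mtcars[c(1, 11, 21), ]

# Split into the same three blocks used in the question.
chunks <- list(mat[1:10, ], mat[11:15, ], mat[16:32, ])

# Hypothetical helper: copy each row into the NA rows below it.
fill_down <- function(d) {
  repeat {
    na <- which(is.na(d[, 1]))
    if (length(na) == 0) break
    d[na, ] <- d[na - 1, ]
  }
  d
}

cl <- makeCluster(length(chunks))
# clusterApply calls fill_down(chunks[[i]]) on worker i, so each worker
# receives only its own block and no clusterExport of `mat` is needed.
out <- clusterApply(cl, chunks, fill_down)
stopCluster(cl)
```

Here out is a list of three filled data frames in the same order as chunks. A single function applied to different data also avoids maintaining three nearly identical job definitions.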