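I have set up an MPI cluster. It runs on CentOS using Open MPI. I am trying to run R jobs on it, but mpirun reports an error even though the job itself runs fine. This doesn't make any sense to me. Any ideas why this happens? Am I not shutting down MPI correctly in the R code?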
dosnow.r
#!/usr/bin/env Rscript
hello.world <- function(i) {
  sprintf('Hello from loop iteration %d running on rank %d on node %s',
          i, mpi.comm.rank(), Sys.info()[c("nodename")]);
}
library(foreach)
library(snow)
library(doSNOW)
cl <- makeMPIcluster( 3 )
registerDoSNOW(cl)
output.lines <- foreach(i = (1:10)) %dopar% {
  hello.world(i)
}
cat(unlist(output.lines), sep='\n')
stopCluster(cl)
mpirun:
mpirun --hostfile wwhosts R CMD BATCH dosnow.r
--------------------------------------------------------------------------
mpirun has exited due to process rank 0 with PID 3200 on
node wwmaster exiting improperly. There are three reasons this could occur:
1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.
2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"
3. this process called "MPI_Abort" or "orte_abort" and the mca parameter
orte_create_session_dirs is set to false. In this case, the run-time cannot
detect that the abort call was an abnormal termination. Hence, the only
error message you will receive is this one.
This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
You can avoid this message by specifying -quiet on the mpirun command line.
--------------------------------------------------------------------------
Rout:
R version 3.2.3 (2015-12-10) -- "Wooden Christmas-Tree"
Copyright (C) 2015 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
[Previously saved workspace restored]
> #!/usr/bin/env Rscript
>
> hello.world <- function(i) {
+ sprintf('Hello from loop iteration %d running on rank %d on node %s',
+ i, mpi.comm.rank(), Sys.info()[c("nodename")]);
+ }
>
> library(foreach)
> library(snow)
> library(doSNOW)
Loading required package: iterators
>
> cl <- makeMPIcluster( 3 )
Loading required namespace: Rmpi
3 slaves are spawned successfully. 0 failed.
> registerDoSNOW(cl)
>
> output.lines <- foreach(i = (1:10)) %dopar% {
+ hello.world(i)
+ }
>
> cat(unlist(output.lines), sep='\n')
Hello from loop iteration 1 running on rank 1 on node n0001.cluster
Hello from loop iteration 2 running on rank 2 on node wwmaster
Hello from loop iteration 3 running on rank 3 on node n0000.cluster
Hello from loop iteration 4 running on rank 1 on node n0001.cluster
Hello from loop iteration 5 running on rank 2 on node wwmaster
Hello from loop iteration 6 running on rank 3 on node n0000.cluster
Hello from loop iteration 7 running on rank 3 on node n0000.cluster
Hello from loop iteration 8 running on rank 1 on node n0001.cluster
Hello from loop iteration 9 running on rank 2 on node wwmaster
Hello from loop iteration 10 running on rank 1 on node n0001.cluster
>
> stopCluster(cl)
[1] 1
>
> proc.time()
   user  system elapsed
  0.835   0.290   4.177
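
A possible variant to try, assuming reason 2 in the mpirun message is the one that applies here (the master R process called "init" but exits without calling "finalize"): end the script with Rmpi's `mpi.quit()` after `stopCluster(cl)`, so that MPI is finalized before R exits. This is only a sketch under that assumption, not a confirmed fix for this cluster.

```r
#!/usr/bin/env Rscript
# Sketch only: same job as dosnow.r, with an explicit MPI shutdown at the end.
library(foreach)
library(snow)
library(doSNOW)

hello.world <- function(i) {
  # Runs on the workers; mpi.comm.rank() resolves there, as in the log above.
  sprintf("Hello from loop iteration %d running on rank %d on node %s",
          i, mpi.comm.rank(), Sys.info()[["nodename"]])
}

cl <- makeMPIcluster(3)
registerDoSNOW(cl)

output.lines <- foreach(i = 1:10) %dopar% {
  hello.world(i)
}
cat(unlist(output.lines), sep = "\n")

stopCluster(cl)

# Assumption: stopCluster() alone does not finalize MPI on the master,
# so finalize MPI and quit R explicitly as the last statement.
Rmpi::mpi.quit()
```

Because `mpi.quit()` also terminates the R session, it has to be the last statement in the script; anything placed after it will not run.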