How do I make an R script run in parallel on a MOSIX cluster?

Time: 2020-01-08 21:19:11

Tags: r mosix

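I am trying to recreate the example given in Section 3 of this paper, which runs a simple computation across multiple instances managed by a cluster. The main computation happens in this script, "sim.R":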

# sim.R
# If the "batch" package has not been installed, run the line below:
# install.packages("batch", repos = "http://cran.cnr.Berkeley.edu")
seed <- 1000
n <- 50
nsim <- 10000
mu <- c(0, 0.5)
sd <- c(1, 1)
library("batch")

parseCommandArgs()
set.seed(seed)
pvalue <- rep(0,nsim)

for(i in 1:nsim) {
        X <- rnorm(n = n, mean = mu[1], sd = sd[1])
        Y <- rnorm(n = n, mean = mu[2], sd = sd[2])
        pvalue[i] <- t.test(X, Y)$p.value
}
power <- mean(pvalue <= 0.05)

out <- data.frame(seed = seed, nsim = nsim, n = n,
        mu = paste(mu, collapse = ","),
        sd = paste(sd, collapse = ","), power = power)
outfilename <- paste("res", seed, ".csv", sep = "")
print(out)
write.csv(out, outfilename, row.names = FALSE)

To run multiple parallel instances of sim.R, there is a second script, "param-sim.R":

library("batch")
seed <- 1000
for(i in 1:10) {
        seed <- rbatch("sim.R", seed = seed, n = 25, mu = c(0, i / 10))
        rbatch.local.run() # My understanding from the linked paper is that this line will do nothing if the script is run on a mosix cluster and not locally.
}

To run this on the MOSIX cluster, I use the following command in the terminal:

R --vanilla --args RBATCH mosix < param-sim.R

I expected this to produce 10 .csv files, named res1000.csv through res1009.csv. Instead, this is what I get (I am running this in an Ubuntu environment):

$ R --vanilla --args RBATCH mosix < param-sim.R

R version 3.4.4 (2018-03-15) -- "Someone to Lean On"
Copyright (C) 2018 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library("batch")
> seed <- 1000
> for(i in 1:10) {
+   seed <- rbatch("sim.R", seed = seed, n = 25, mu = c(0, i / 10))
+   rbatch.local.run()
+ }
nohup mosrun -e -b -q R --vanilla --args  seed 1000 n 25 mu "c(0,0.1)" < sim.R > sim.Rout1000 & 
rbatch.local.run: no commands have been batched.
nohup: redirecting stderr to stdout
nohup mosrun -e -b -q R --vanilla --args  seed 1001 n 25 mu "c(0,0.2)" < sim.R > sim.Rout1001 & 
nohup: redirecting stderr to stdout
rbatch.local.run: no commands have been batched.
nohup mosrun -e -b -q R --vanilla --args  seed 1002 n 25 mu "c(0,0.3)" < sim.R > sim.Rout1002 & 
rbatch.local.run: no commands have been batched.
nohup mosrun -e -b -q R --vanilla --args  seed 1003 n 25 mu "c(0,0.4)" < sim.R > sim.Rout1003 & nohup: 
redirecting stderr to stdout
rbatch.local.run: no commands have been batched.
nohup mosrun -e -b -q R --vanilla --args  seed 1004 n 25 mu "c(0,0.5)" < sim.R > sim.Rout1004 & 
nohup: redirecting stderr to stdout
rbatch.local.run: no commands have been batched.
nohup mosrun -e -b -q R --vanilla --args  seed 1005 n 25 mu "c(0,0.6)" < sim.R > sim.Rout1005 & 
nohup: redirecting stderr to stdout
rbatch.local.run: no commands have been batched.
nohup mosrun -e -b -q R --vanilla --args  seed 1006 n 25 mu "c(0,0.7)" < sim.R > sim.Rout1006 & 
nohup: redirecting stderr to stdout
rbatch.local.run: no commands have been batched.
nohup mosrun -e -b -q R --vanilla --args  seed 1007 n 25 mu "c(0,0.8)" < sim.R > sim.Rout1007 & 
nohup: redirecting stderr to stdout
rbatch.local.run: no commands have been batched.
nohup mosrun -e -b -q R --vanilla --args  seed 1008 n 25 mu "c(0,0.9)" < sim.R > sim.Rout1008 & 
nohup: redirecting stderr to stdout
rbatch.local.run: no commands have been batched.
nohup mosrun -e -b -q R --vanilla --args  seed 1009 n 25 mu "c(0,1)" < sim.R > sim.Rout1009 & 
nohup: redirecting stderr to stdout
rbatch.local.run: no commands have been batched.
> 
nohup: redirecting stderr to stdout
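
To confirm which of the expected result files exist after a run like the one above, a small shell check can help. This is a minimal sketch, assuming sim.R writes res<seed>.csv into the current directory; `missing_results` is a hypothetical helper name, not part of the batch package or MOSIX.

```shell
#!/bin/sh
# missing_results is a hypothetical helper (not part of batch or MOSIX).
# Usage: missing_results FIRST LAST
# Prints every expected file res<seed>.csv, for seeds FIRST..LAST,
# that does not exist in the current directory.
missing_results() {
    seed=$1
    last=$2
    while [ "$seed" -le "$last" ]; do
        [ -f "res${seed}.csv" ] || echo "res${seed}.csv"
        seed=$((seed + 1))
    done
}

# Seeds 1000..1009 match the ten jobs submitted by param-sim.R.
missing_results 1000 1009
```

If every job had succeeded, this would print nothing.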

No .csv files were generated, and every output file (e.g. sim.Rout1000) contains the same information:

mosrun - MOSIX Version 4.3.4
Usage: mosrun [location-options] [program-options] {program} [args]...
       mosrun -S{maxjobs} [location-options] [program-options]
                                                {commands-file}[,{failed-file}]
       mosrun -R{filename} [-O{fd=filename}][,{fd2=fn2}]... [location-options]

       mosrun -I{filename}

  Location options - Node specification:
        -b                      try to start on 'best' available node
        -r{hostname}            start on given host
        -{a.b.c.d}              start on the node of given IP address
        -{n}                    start on given logical node number
        -h                      start on home node
  Other location options:
        -F                      do not fail if requested node is not available
        -L                      lock, disallow automatic migration
        -l                      unlock, allowing automatic migration
        -g                      disallow automatic freezing
        -G                      allow automatic freezing
        -m{mb}                  try to run only on nodes with >= mb free memory
        -A {minutes}            auto checkpoint interval in minutes (0-10000000)
        -N {max}                max. # of checkpoints before cycle (0-10000000)
  Program options:
        -e                      unsupported system calls produce -1/errno=ENOSYS
        -w                      as -e, but print warnings for unsupported calls
        -u                      unsupported system calls kill mosrun (default)
        -d {0-10000}            specify decay rate per second in parts of 10000
        -c                      consider program as a pure CPU job (ignore I/O)
        -n                      reverse '-c', so to include I/O considerations
        -C{filename}            test given checkpoint file
        -X{/directory}          declare private directory
        -z                      program arguments start at argument #0 (not #1)

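This makes me think the program never ran or entered the cluster queue. I also checked the system processes with the "top" command and found nothing. For the record, I have been able to run simple C++ programs on the MOSIX cluster successfully.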
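
One check that follows from the usage text: the generated command passes `-e -b -q` to mosrun, and the option listing in sim.Rout1000 shows `-b` and `-e` but no `-q`. That may be why mosrun prints its usage instead of starting R, though this is an inference from the output, not a confirmed diagnosis. The sketch below compares a set of flags against a saved usage message; `unlisted_flags` is a hypothetical helper name, and the line-matching rule is an assumption about how the usage text is laid out.

```shell
#!/bin/sh
# unlisted_flags is a hypothetical helper (not part of MOSIX or batch).
# Usage: unlisted_flags USAGE_FILE FLAG...
# Prints each FLAG that has no option line in USAGE_FILE. An option line is
# one whose first non-blank text is the flag followed by whitespace, a
# brace, or end of line, e.g. "  -b   try to start on 'best' available node".
unlisted_flags() {
    usage_file=$1
    shift
    for flag in "$@"; do
        grep -Eq -e "^[[:space:]]*${flag}([[:space:]{]|\$)" "$usage_file" || echo "$flag"
    done
}

# sim.Rout1000 holds the usage text; -e -b -q are the flags in the
# "nohup mosrun ..." commands echoed by rbatch.
if [ -f sim.Rout1000 ]; then
    unlisted_flags sim.Rout1000 -e -b -q
fi
```

Any flag it prints is worth checking against the mosrun manual for the installed MOSIX version before rerunning.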

Am I missing a key detail that would allow this program to run properly?
