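I have a script that reads parameters from a file (param.txt) and runs R code (myRcode) for each parameter combination (40 in total). My Mac has 8 cores, so I would like the script to keep 8 jobs running at a time (when one finishes, start another, and so on). The script I currently have is: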
#!/bin/bash
while read param1 param2 param3
do
nohup R --no-save > output_${param1}.txt << EOP &
source("MyRProgram.R");
myRcode(${param1},${param2},${param3})
EOP
echo "JobID = $! for parameters seed=${param1} n=${param2} submitted on `date`"
done < param.txt
exit
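If I run ./Myscript, all 40 jobs start at the same time. I know I could write 8 separate scripts (each reading its parameters from a different file), with the R calls inside each script separated by ";" so that they run sequentially within that script. Is there a better way that involves only one script?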
Answer 0 (score: 0)
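There is no need for that; just use the excellent GNU Parallel, which is available from the GNU project website. Try this to run 8 jobs at a time: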
#!/bin/bash
#
# Make a little job to do - nothing too tough!
echo sleep 5 > job
chmod +x job
# Now run those puppies
parallel -k -j 8 <<EOF
./job; date +'%H:%M:%S Job1 done'
./job; date +'%H:%M:%S Job2 done'
./job; date +'%H:%M:%S Job3 done'
./job; date +'%H:%M:%S Job4 done'
./job; date +'%H:%M:%S Job5 done'
./job; date +'%H:%M:%S Job6 done'
./job; date +'%H:%M:%S Job7 done'
./job; date +'%H:%M:%S Job8 done'
./job; date +'%H:%M:%S Job9 done'
./job; date +'%H:%M:%S Job10 done'
./job; date +'%H:%M:%S Job11 done'
./job; date +'%H:%M:%S Job12 done'
./job; date +'%H:%M:%S Job13 done'
./job; date +'%H:%M:%S Job14 done'
./job; date +'%H:%M:%S Job15 done'
./job; date +'%H:%M:%S Job16 done'
EOF
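Or, if you don't like the idea of installing GNU Parallel, you can do this, which starts the jobs in batches of 8 and waits for each batch to finish before starting the next: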
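#!/bin/bash
MAX=8   # maximum number of jobs per batch
j=0     # jobs started in the current batch
while read -r param1 param2 param3
do
    # Start one R job in the background
    nohup R --no-save > "output_${param1}.txt" << EOP &
source("MyRProgram.R");
myRcode(${param1},${param2},${param3})
EOP
    ((j++))
    # Once MAX jobs have been started, wait for the whole batch to finish
    if [ "$j" -eq "$MAX" ]; then
        echo "Pausing with $MAX processes..."
        j=0
        wait
    fi
done < param.txt
# Wait for the final (possibly partial) batch
wait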
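Note that the loop above works in batches: a new job only starts once all 8 jobs of the current batch are done. If you want a new job to start as soon as any one finishes, something like the following sketch might do. It is only an illustration under a few assumptions: run_job is a hypothetical stand-in for the R call, the parameter lines are made up, and it polls with jobs because the bash 3.2 shipped with macOS has no wait -n.

```shell
#!/bin/bash
# Sketch of a rolling pool: keep at most MAX jobs running at once.
MAX=8

# Hypothetical stand-in for the R invocation; replace the body with the
# "R --no-save ... << EOP" call from the script above.
run_job() {
    sleep 0.2
    echo "finished $1 $2 $3"
}

params=$(mktemp)    # made-up parameter file standing in for param.txt
results=$(mktemp)   # collects the output of the stand-in jobs
for i in $(seq 1 20); do
    echo "$i $((i * 10)) 0.5"
done > "$params"

while read -r param1 param2 param3
do
    # Block while MAX jobs are still running, polling every 0.1 s
    while (( $(jobs -rp | wc -l) >= MAX )); do
        sleep 0.1
    done
    run_job "$param1" "$param2" "$param3" >> "$results" &
done < "$params"
wait   # wait for the jobs that are still running at the end

completed=$(wc -l < "$results" | tr -d ' ')
echo "$completed jobs completed"
rm -f "$params" "$results"
```

The polling interval is a trade-off: a shorter sleep starts the next job sooner but wakes the shell more often. For R jobs that run for minutes, one second would be plenty.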
By the way, you can get the number of cores on a Mac with:
sysctl -n hw.logicalcpu
or
parallel --number-of-cores
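If you would rather not hard-code 8, the job limit could be derived from the core count at run time. The sketch below is one possible way to do it; the fallback to getconf for non-Mac systems and the default of 1 are my own assumptions, not something the script above requires.

```shell
#!/bin/bash
# Sketch: set MAX from the machine's core count instead of hard-coding it.
if MAX=$(sysctl -n hw.logicalcpu 2>/dev/null); then
    :   # macOS: number of logical CPUs
elif MAX=$(getconf _NPROCESSORS_ONLN 2>/dev/null); then
    :   # Linux and other systems where this getconf name is supported
else
    MAX=1   # assumed fallback when neither command works
fi
echo "Running up to $MAX jobs at a time"
```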
Answer 1 (score: 0)
Using GNU Parallel:
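#!/bin/bash
Rjob() {
    param1=$1
    param2=$2
    param3=$3
    # No background PID ($!) exists inside the function, so just log the parameters
    echo "Job for parameters seed=${param1} n=${param2} started on $(date)"
    R --no-save > "output_${param1}.txt" << EOP
source("MyRProgram.R");
myRcode(${param1},${param2},${param3})
EOP
}
# Export the function so the shells started by parallel can see it
export -f Rjob
# Run at most 8 jobs at a time, splitting each line of param.txt on whitespace
parallel -j 8 --colsep '\s+' Rjob {1} {2} {3} < param.txt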
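If GNU Parallel is not installed, roughly the same pattern can be sketched with xargs -P, which is a different tool and ships with both macOS and Linux. This is only an illustration: Rjob here is a hypothetical stand-in that echoes its arguments instead of calling R, the parameter values are made up, and it assumes every line holds exactly three whitespace-separated fields with no quotes.

```shell
#!/bin/bash
# Hypothetical stand-in for the R call; it just echoes its three parameters.
Rjob() {
    echo "seed=$1 n=$2 p3=$3"
}
export -f Rjob   # make the function visible to the bash started by xargs

params=$(mktemp)   # made-up parameter file standing in for param.txt
printf '%s\n' '1 100 0.5' '2 200 0.7' '3 300 0.9' > "$params"

# -n 3: pass three arguments per invocation; -P 8: run up to 8 at once.
# The "_" fills $0 so that the parameters arrive as $1, $2 and $3.
output=$(xargs -n 3 -P 8 bash -c 'Rjob "$@"' _ < "$params")
echo "$output"
rm -f "$params"
```

Because the jobs run concurrently, the order of the output lines is not guaranteed. GNU Parallel's -k option, used in the first answer, exists to keep output in input order.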