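Grep qstat output and copy files on completion

Time: 2017-06-14 15:31:03

Tags: bash cluster-computing pbs

I use the PBS job scheduler on a cluster. In bash, I want to monitor the job status and, once the job has finished, copy the results to a location (/data/myfolder/).

My qstat output looks like this: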

    JobID  Username Queue Jobname SessID NDS TSK Memory Time  Status 
    ----------------------------------------------------------------
    717.XXXXXX  user XXXX       SS  2323283 1  24  122gb --     E   

Thanks in advance.

2 answers:

Answer 0: (score: 1)

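You can simply grep the qstat output for " C " (the completed state), but you can also use qsub's -o [hostname:]path option to stream the output to its final destination, provided you have ssh keys set up from the compute nodes for your POSIX account.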
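As a rough sketch of the -o route: the wrapper below passes a host-qualified path to qsub for both stdout and stderr. The function name submit_to_dest, the file names job.out and job.err, and the host name loginnode in the usage line are hypothetical. Note that -o and -e only redirect the job's output streams; any other result files would still need to be copied separately.

```shell
#!/bin/bash
# submit_to_dest SCRIPT DEST_HOST DEST_DIR
# Submit SCRIPT and ask the scheduler to deliver stdout/stderr to
# DEST_HOST:DEST_DIR. File names job.out/job.err are illustrative only.
submit_to_dest () {
    local script="$1" dest_host="$2" dest_dir="$3"
    # Strip a trailing slash so the joined path has exactly one separator.
    local base="${dest_host}:${dest_dir%/}"
    qsub -o "${base}/job.out" -e "${base}/job.err" "$script"
}
```

For example, `submit_to_dest myjob.sh loginnode /data/myfolder` would request delivery to loginnode:/data/myfolder/job.out, assuming passwordless ssh from the compute nodes to that host.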

If you do end up using grep, be a good citizen and limit your polling to once or twice per minute, so that you do not spam the server and hurt its performance.
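A minimal sketch of such a polite polling loop, assuming the job state is the last column of the qstat line (as in the question's sample output) and that a finished job either shows C or drops out of the listing. The function names, the 30-second default interval and the paths in the usage line are illustrative, not part of any PBS tooling.

```shell
#!/bin/bash
# job_state JOB_ID
# Print the state letter (last field) of the qstat line whose first column
# is JOB_ID or starts with "JOB_ID."; print nothing if the job is not listed.
job_state () {
    local job_id="$1"
    qstat | awk -v id="$job_id" '$1 == id || index($1, id ".") == 1 { print $NF }'
}

# wait_and_copy JOB_ID SRC DEST [INTERVAL_SECONDS]
# Poll qstat every INTERVAL_SECONDS (default 30) until the job is marked C
# or has disappeared from the listing, then copy SRC to DEST.
wait_and_copy () {
    local job_id="$1" src="$2" dest="$3" interval="${4:-30}"
    local state
    while true; do
        state="$(job_state "$job_id")"
        if [ -z "$state" ] || [ "$state" = "C" ]; then
            break
        fi
        sleep "$interval"
    done
    cp -r "$src" "$dest"
}
```

A call such as `wait_and_copy 717 /path/to/results /data/myfolder/` would then block until job 717 is done and copy the results; /path/to/results stands in for wherever your job writes its output.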

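Answer 1: (score: 1)

There is a script here that does this (written for SGE). I started excerpting the relevant parts for you, but you will probably find it easier to start from the full script: insert your own qsub command in the submit_job function, then put the code that copies your results after the wait_job_finish call in the script. You can remove the log printing at the end if you like.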

#!/bin/bash

# this script will submit a qsub job and check on host information for the cluster
# node which it ends up running on
# ~~~~~ CUSTOM FUNCTIONS ~~~~~ #
submit_job () {
    local job_name="$1"
    qsub -j y -N "$job_name" -o :${PWD}/ -e :${PWD}/ <<E0F
set -x
hostname
cat /etc/hosts
python -c "import socket; print(socket.gethostbyname(socket.gethostname()))"
# sleep 5000
E0F
}

wait_job_start () {
    local job_id="$1"
    printf "waiting for job to start"
    while ! qstat | grep "$job_id" | grep -Eq '[[:space:]]r[[:space:]]'
    do
        printf "."
        sleep 1
    done
    printf "\n\n"

    local node_name="$(get_node_name "$job_id")"
    printf "Job is running on node %s\n\n" "$node_name"
}

wait_job_finish () {
    local job_id="$1"
    printf "waiting for job to finish"
    while qstat | grep -q "$job_id"
    do
        printf "."
        sleep 1
    done
    printf "\n\n"
}

check_for_job_submission () {
    local job_id="$1"
    # grep -q succeeds when the job ID appears in the qstat listing
    if qstat | grep -q "$job_id" ; then
        echo "its there"
    else
        echo "not there"
    fi
}

get_node_name () {
    local job_id="$1"
    qstat | grep "$job_id" | sed -e 's|^.*[[:space:]]\([a-zA-Z0-9.]*@[^ ]*\).*$|\1|g'
}
# ~~~~~ RUN ~~~~~ #
printf "Submitting cluster job to get node hostname and IP\n\n"

job_name="get_node_hostnames"
job_id="$(submit_job "$job_name")" # Your job 832606 ("get_node_hostnames") has been submitted
job_id="$(echo "$job_id" | sed -e 's|.*[[:space:]]\([[:digit:]]*\)[[:space:]].*|\1|g' )"
job_stdout_log="${job_name}.o${job_id}"

printf "Job ID:\t%s\nJob Name:\t%s\n\n" "$job_id" "$job_name"

wait_job_start "$job_id"
wait_job_finish "$job_id"

printf "\n\nReading log file %s\n\n" "$job_stdout_log"
[ -f "$job_stdout_log" ] && cat "$job_stdout_log"
printf "\n\nRemoving log file %s\n\n" "$job_stdout_log"
[ -f "$job_stdout_log" ] && rm -f "$job_stdout_log"

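Side note: if you prefer Python, there is a more robust equivalent here.

You will probably have to tweak both of them a little to get them working on your PBS system, since they were written for SGE.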
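One adjustment that is likely needed is the job ID parsing: SGE's qsub prints a sentence like the one quoted in the script's comment, whereas PBS/Torque's qsub prints just the job identifier (717.XXXXXX in the question). Below is a sketch of a helper that accepts either form; parse_job_id is a hypothetical name, not something from the linked script.

```shell
#!/bin/bash
# parse_job_id QSUB_OUTPUT
# Extract the job ID from the text printed by qsub.
parse_job_id () {
    local qsub_output="$1"
    case "$qsub_output" in
        "Your job "*)
            # SGE style: Your job 832606 ("name") has been submitted
            printf '%s\n' "$qsub_output" | sed -e 's|^Your job \([0-9][0-9]*\) .*|\1|'
            ;;
        *)
            # PBS/Torque style: the output is the identifier itself;
            # take the first field of the first non-empty line.
            printf '%s\n' "$qsub_output" | awk 'NF { print $1; exit }'
            ;;
    esac
}
```

In the script above, this would replace the sed line that follows the submit_job call, for example `job_id="$(parse_job_id "$(submit_job "$job_name")")"`. The state checks would need a similar adjustment, since SGE reports a running job as lowercase r and the sample PBS output uses uppercase letters.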