SOS Berlin JobScheduler - job chain - how to trigger another job after a job times out

Date: 2017-07-28 05:45:11

Tags: timeout scheduled-tasks kill job-scheduling

I am using the SOS Berlin JobScheduler (version linux-x64 1.10.5).

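Normally, when a job in a job_chain times out, the scheduler kills the job process and sends an email. Based on that, I want to trigger another job. However, neither of the two approaches I tried works.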

Approach 1:

Add a spooler_task_after() function to the job.

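I guess this fails because the job runs as a process on the Linux system: when the job times out, the scheduler kills that process, and the spooler_task_after() function is killed along with it.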

Code:

<job timeout="00:00:09">
    <script language="shell"><![CDATA[

    echo aa
    sleep 10s
    echo bb

    ]]></script>
    <monitor name="exit_code" ordering="0">
    <script language="java:javascript"><![CDATA[

    // Runs after the task ends; logs the exit code of the shell process.
    function spooler_task_after() {
        var exitCode = spooler_task.exit_code;
        spooler_log.info("Exit Code is: " + exitCode);

        /*
         call other job
        */

        return true;
    }

            ]]></script>
    </monitor>
    <run_time/>
</job>

Result:

2017-07-27 21:22:21.251+0800 [info]   
2017-07-27 21:22:21.251+0800 [info]   Task sample_errorhandling/job1:23026 - Protocol starts in /httx/opt/sos-scheduler/ldw-scheduler-test1/logs/task.sample_errorhandling,job1.log
2017-07-27 21:22:21.250+0800 [info]   SCHEDULER-842  Task is going to process Order sample_errorhandling/job_chain3:12, state=aaa, on JobScheduler 'http://xxxx:4444', Order's Process_class
2017-07-27 21:22:21.268+0800 [info]   SCHEDULER-726  Task runs on this JobScheduler 'http://jt-host-kvm-72:4444'
2017-07-27 21:22:21.268+0800 [info]   SCHEDULER-918  state=starting (at=never)
2017-07-27 21:22:22.466+0800 [info]   SCHEDULER-987  Starting process: '/bin/sh' '-c' '"/tmp/admin/sos.gBdCm8"'
2017-07-27 21:22:23.520+0800 [info]   [stdout] aa
2017-07-27 21:22:30.326+0800 [ERROR]  SCHEDULER-272  Terminating task after reaching deadline <job timeout="9">
2017-07-27 21:22:30.359+0800 [ERROR]  SCHEDULER-202  Connection to task has been lost, state=running_remote_process: Z-REMOTE-101  Separate process: pid=0: Connection lost / zschimmer::com::object_server::Connection::pop_operation
2017-07-27 21:22:30.359+0800 [ERROR]  SCHEDULER-202  Connection to task has been lost, state=release: Z-REMOTE-122  Separate process pid=0: Caller has killed process
2017-07-27 21:22:30.384+0800 [ERROR]  SCHEDULER-280  Process terminated with exit code 1 (0x63)
2017-07-27 21:22:30.384+0800 [WARN]   SCHEDULER-845  Task ended without processing the order. The order remains in job's order queue in the same state
2017-07-27 21:22:30.384+0800 [info]   SCHEDULER-843  Task has ended processing of Order sample_errorhandling/job_chain3:12, state=aaa, on JobScheduler 'http:/xxxx:4444'

Approach 2:

Add return code handling to the job chain node.

This approach works when the job finishes successfully or with an error, but it fails when the job is killed by the timeout.

Code in the job chain:

<job_chain>
    <job_chain_node state="aaa" job="job1" next_state="success" error_state="error">
        <on_return_codes>
            <on_return_code return_code="1">
                <add_order xmlns="https://jobscheduler-plugins.sos-berlin.com/NodeOrderPlugin" job_chain="/error_handling/sendmail"/>
            </on_return_code>
        </on_return_codes>
    </job_chain_node>

    <job_chain_node state="success"/>

    <job_chain_node state="error"/>
</job_chain>

1 Answer:

Answer 0 (score: 1)

You can use the error_state attribute. When JobScheduler terminates a task because of a timeout, it treats this as an error condition.

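Note that the next_state of the errorHandling node is error, so that JOC shows the order as having ended in error, and that the errorHandling node has its own error_state to indicate whether the error handler itself failed.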

 <job_chain>
     <job_chain_node  state="100" job="job1" next_state="200" error_state="errorHandling"/>
     <job_chain_node  state="200" job="job2" next_state="success" error_state="errorHandling"/>
     <job_chain_node  state="errorHandling" job="errorHandlerJob" next_state="error" error_state="errorInErrorHandling"/>
     <job_chain_node  state="success"/>
     <job_chain_node  state="errorInErrorHandling"/>
     <job_chain_node  state="error"/>
 </job_chain>
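
For completeness, here is a minimal sketch of what the errorHandlerJob referenced in the chain above might look like. It is not part of the original answer. It assumes the job is stored as errorHandlerJob.job.xml in the same folder as the job chain, that it is an order-driven shell job, and that JobScheduler exposes the SCHEDULER_JOB_CHAIN and SCHEDULER_ORDER_ID environment variables to shell jobs; check these names against your version. The echo line is only a placeholder for the real reaction (sending mail, starting another job, cleaning up):

<job order="yes" stop_on_error="no">
    <script language="shell"><![CDATA[

    # Placeholder handler body (hypothetical): replace with the real notification or cleanup.
    # SCHEDULER_JOB_CHAIN and SCHEDULER_ORDER_ID are assumed to be set by JobScheduler.
    echo "error handling for order ${SCHEDULER_ORDER_ID} in job chain ${SCHEDULER_JOB_CHAIN}"

    # A zero exit code moves the order on to next_state (error);
    # a non-zero exit code would send it to error_state (errorInErrorHandling).
    exit 0

    ]]></script>
    <run_time/>
</job>

Keeping the handler as a regular node in the chain means the failed order itself carries the context into the handler, so no separate order needs to be created.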