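Suppose my job ran for a while, was suspended because the machine was overloaded, and some time later resumed and ran to completion. The job therefore went through the states RUNNING -> SUSPENDED -> RUNNING.
How can I get all the states a given job has gone through?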
Answer 0 (score: 0)
Use bjobs -l if the job has not yet been cleaned from the system.
Otherwise use bhist -l. You may also need -n, depending on how old the job is.
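Here is an example of bhist -l output for a job that was suspended and later resumed because the system load temporarily exceeded the configured threshold.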
$ bhist -l 1168

Job <1168>, User <mclosson>, Project <default>, Command <sleep 10000>
Fri Jan 20 15:08:40: Submitted from host <hostA>, to Queue <normal>, CWD <$HOME>, Specified Hosts <hostA>;
Fri Jan 20 15:08:41: Dispatched 1 Task(s) on Host(s) <hostA>, Allocated 1 Slot(s) on Host(s) <hostA>, Effective RES_REQ <select[type == any] order[r15s:pg] >;
Fri Jan 20 15:08:41: Starting (Pid 30234);
Fri Jan 20 15:08:41: Running with execution home </home/mclosson>, Execution CWD </home/mclosson>, Execution Pid <30234>;
Fri Jan 20 16:19:22: Suspended: Host load exceeded threshold: 1-minute CPU run queue length (r1m)
Fri Jan 20 16:21:43: Running;

Summary of time in seconds spent in various states by Fri Jan 20 16:22:09
  PEND     PSUSP    RUN      USUSP    SSUSP    UNKWN    TOTAL
  1        0        4267     0        141      0        4409
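At 16:19:22 the job was suspended because r1m exceeded its threshold. Later, at 16:21:43, the job resumed running. The 141 seconds between those two events match the SSUSP value in the summary.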
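If you want the state sequence in a script rather than reading it by eye, the event lines can be picked out of the bhist -l text. The following is only a rough sketch in Python, not an LSF-provided tool: the helper name state_history and the regular expression are my own assumptions, and it expects each event on a single line (wrapped continuation lines rejoined), as in the sample embedded here from the output above.

```python
import re

# Event lines from the bhist -l output above, with wrapped lines rejoined.
SAMPLE = """\
Job <1168>, User <mclosson>, Project <default>, Command <sleep 10000>
Fri Jan 20 15:08:40: Submitted from host <hostA>, to Queue <normal>, CWD <$HOME>, Specified Hosts <hostA>;
Fri Jan 20 15:08:41: Dispatched 1 Task(s) on Host(s) <hostA>, Allocated 1 Slot(s) on Host(s) <hostA>, Effective RES_REQ <select[type == any] order[r15s:pg] >;
Fri Jan 20 15:08:41: Starting (Pid 30234);
Fri Jan 20 15:08:41: Running with execution home </home/mclosson>, Execution CWD </home/mclosson>, Execution Pid <30234>;
Fri Jan 20 16:19:22: Suspended: Host load exceeded threshold: 1-minute CPU run queue length (r1m)
Fri Jan 20 16:21:43: Running;
"""

# Assumed line shape: "<Day> <Mon> <dd> <hh:mm:ss>: <FirstWord> ...".
# The first word after the timestamp is treated as the event/state name.
EVENT = re.compile(r"^(\w{3} \w{3} [ \d]\d \d\d:\d\d:\d\d): (\w+)")


def state_history(text):
    """Return (timestamp, event) pairs for every timestamped line in text."""
    events = []
    for line in text.splitlines():
        match = EVENT.match(line)
        if match:
            events.append((match.group(1), match.group(2)))
    return events


if __name__ == "__main__":
    for timestamp, event in state_history(SAMPLE):
        print(timestamp, event)
```

In practice you would feed it the captured output of bhist -l for your job ID instead of SAMPLE. Lines that do not start with a timestamp, such as the job header and the summary table, are simply skipped.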