Timing out a forked process

Date: 2017-07-21 22:27:17

Tags: perl fork

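I'm running Monte Carlo simulations on multiple processors, but they hang a lot. So I put together this Perl code to kill an iteration that hangs and move on to the next one. But I get some errors that I haven't figured out yet. I think it sleeps too long and deletes the out.mt0 file before it can look for it. Here is the code: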

      my $pid = fork();
  die "Could not fork\n" if not defined $pid;

  if ($pid==0){

      print "In child\n";

        system("hspice -i mont_read.sp -o out -mt 4");wait;
        sleep(.8);wait;
        exit(0);

  }

      print "In parent \n";




$i = 0;

  $mont_number = $j - 1;

  out: while (1){

  $res=waitpid($pid, WNOHANG);

  if ($res == -1) {

      print "Successful Exit Process Detected\n";
      system("mv out.mt0 mont_read.mt0");wait;
      sleep(1);wait;
      system("perl monte_stat.pl > rel_out.txt"); wait ;
      system("cat stat_result.txt rel_out.txt > stat_result.tmp"); wait; 
      system("mv stat_result.tmp stat_result.txt");wait;
      print "\nSim #$mont_number complete\n";wait;
      last out;

  }

  if($res != -1){

    if($i>=$timeout){

      $hang_count = $hang_count+1;
      system("killall hspice");wait;
      sleep(1);
      print("time_out complete\n");wait;
      print "\nSim #$mont_number complete\n";wait;
      last out; 

    }

    if($i<$timeout){

        sleep $slept;wait;

    }

  $i=$i+1;

  }

  }

Here are the errors:

 Illegal division by zero at monte_stat.pl line 73, <INHSPOUT> line 2.
mv: cannot stat `out.mt0': No such file or directory
Illegal division by zero at monte_stat.pl line 73, <INHSPOUT> line 1.
mv: cannot stat `out.mt0': No such file or directory
Illegal division by zero at monte_stat.pl line 73, <INHSPOUT> line 1.
mv: cannot stat `out.mt0': No such file or directory
Illegal division by zero at monte_stat.pl line 73.
mv: cannot stat `out.mt0': No such file or directory
Illegal division by zero at monte_stat.pl line 73.
mv: cannot stat `out.mt0': No such file or directory
mv: cannot stat `out.mt0': No such file or directory
mv: cannot stat `out.mt0': No such file or directory
Illegal division by zero at monte_stat.pl line 73, <INHSPOUT> line 3.
mv: cannot stat `out.mt0': No such file or directory
Illegal division by zero at monte_stat.pl line 73, <INHSPOUT> line 1.
mv: cannot stat `out.mt0': No such file or directory

Can anyone tell me where to start debugging this? Thanks.

1 Answer:

Answer 0 (score: 3)

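Judging by the errors, your hspice run crashes. But there are other problems as well.

First, here is a working example that stays as close to your code as possible.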

use warnings;
use strict;
use feature 'say';
use POSIX qw(:sys_wait_h);
$| = 1;

my ($timeout, $duration, $sleep_time) = (5, 10, 1);

my $pid = fork // die "Can't fork: $!";

if ($pid == 0)  
{
    exec "echo JOB STARTS; sleep $duration; echo JOB DONE";
    die "exec shouldn't return: $!";
}    
say "Started $pid";
sleep 1;

my $tot_sec;    
while (1) 
{
    my $ret = waitpid $pid, WNOHANG;

    if    ($ret > 0) { say "Child $ret exited with: $?";  last; }
    elsif ($ret < 0) { say "\nNo such process ($ret)";    last; }
    else             { print " . " }

    sleep $sleep_time;

    if (($tot_sec += $sleep_time) > $timeout) {
        say "\nTimeout. Send 15 (SIGTERM) signal to the process.";
        kill 15, $pid;
        last;
    }   
}

With $duration (of the job) set to 3, which is shorter than $timeout, we get

Started 16848
JOB STARTS
 .  .  . JOB DONE
Child 16848 exited with: 0

With $duration set to 10 we get

Started 16550
JOB STARTS
 .  .  .  .  .
Timeout. Send 15 (SIGTERM) signal to the process.

and the job is killed (wait another 5 seconds: JOB DONE should not show up).

Comments on the code in the question

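  • If you fork only to run the job, there is no reason for system. Just exec the program

  • There is no need for wait after system, and it is wrong there. system itself waits for the command to finish

  • wait does not belong after print or sleep either, and it is wrong there too

  • There is no need to resort to killall to kill a process

  • If you do end up using system, the program runs in a new process with a different PID. More work is then needed to find that PID and kill it. See Proc::ProcessTable and this post, for example

  • The code above still needs to check whether the process was indeed killed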
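The last point can be sketched as follows. This is only one possible approach, not the only way to do it: stop_child is a hypothetical helper name, the grace period of 2 seconds is an arbitrary choice, and `sleep 30` stands in for the real job. The idea is to send SIGTERM, poll with waitpid for a short while, and escalate to SIGKILL only if the child is still there.

```perl
use strict;
use warnings;
use POSIX qw(:sys_wait_h);

# Hypothetical helper: ask $pid to stop, then confirm that it is gone.
# Returns the name of the last signal that had to be sent ('TERM' or 'KILL').
sub stop_child {
    my ($pid, $grace) = @_;              # $grace: seconds to wait after SIGTERM

    kill 'TERM', $pid;
    for (1 .. $grace * 10) {
        # Nonzero return: the child has been reaped (or is not our child anymore)
        return 'TERM' if waitpid($pid, WNOHANG) != 0;
        select undef, undef, undef, 0.1; # sleep for a tenth of a second
    }

    kill 'KILL', $pid;                   # SIGKILL cannot be caught or ignored
    waitpid $pid, 0;                     # blocking reap, so no zombie is left
    return 'KILL';
}

my $pid = fork // die "Can't fork: $!";
if ($pid == 0) {
    exec 'sleep', '30' or die "exec failed: $!";   # stand-in for the real job
}

my $how = stop_child($pid, 2);
my $alive = kill(0, $pid) ? 'yes' : 'no';          # signal 0 only tests existence
print "Child $pid stopped after SIG$how; still alive: $alive\n";
```

Because the child is reaped with waitpid before the helper returns, the `kill 0` test afterwards should report that the process no longer exists.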

Substitute your command line for echo ..., and add checks to it as needed.
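As a rough sketch of what such checks might look like: the helper name run_job, its return values, and the stand-in shell command are all assumptions made for illustration. The hspice command line in the comment is the one from the question. The point is to look at the exit status and to confirm that the output file exists and is non-empty before moving it, which avoids the "cannot stat" errors.

```perl
use strict;
use warnings;
use POSIX qw(:sys_wait_h);
use File::Copy qw(move);
use File::Temp qw(tempdir);

# Hypothetical helper: run @cmd in a child and poll it once per second.
# Returns 'ok', 'failed' (nonzero exit or killed by a signal), or 'timeout'.
sub run_job {
    my ($timeout, @cmd) = @_;
    my $pid = fork // die "Can't fork: $!";
    if ($pid == 0) {
        exec { $cmd[0] } @cmd;           # list form: no shell in between
        warn "Can't exec $cmd[0]: $!\n";
        POSIX::_exit(127);               # leave the child without running END blocks
    }
    my $waited = 0;
    while (1) {
        my $ret = waitpid $pid, WNOHANG;
        return ($? == 0 ? 'ok' : 'failed') if $ret == $pid;
        last if $waited++ >= $timeout;
        sleep 1;
    }
    kill 'TERM', $pid;
    waitpid $pid, 0;                     # reap the terminated child
    return 'timeout';
}

# In the question the command would be
#   qw(hspice -i mont_read.sp -o out -mt 4)
# Here a stand-in shell command writes the output file so the sketch runs anywhere.
my $dir = tempdir(CLEANUP => 1);
my ($out, $final) = ("$dir/out.mt0", "$dir/mont_read.mt0");

my $status = run_job(5, 'sh', '-c', "echo data > '$out'");
if ($status eq 'ok' and -s $out) {
    move($out, $final) or die "Can't move $out: $!";
    print "Job finished, results are in $final\n";
    # this is where monte_stat.pl would be run
}
else {
    print "Job $status, skipping post-processing\n";
}
```

With the status in hand, a hung or crashed run can be counted and skipped instead of being passed on to monte_stat.pl with no data.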

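Another option is to simply sleep for the $timeout period and then check whether the job is done (the child has exited). With your approach, however, you can do other things while polling.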
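A minimal sketch of that simpler variant, with illustrative values for $timeout and $duration and `sleep` standing in for the real job:

```perl
use strict;
use warnings;
use POSIX qw(:sys_wait_h);

my ($timeout, $duration) = (2, 1);       # illustrative values

my $pid = fork // die "Can't fork: $!";
if ($pid == 0) {
    exec 'sleep', $duration or die "exec failed: $!";   # stand-in for the job
}

sleep $timeout;                          # no polling: wait out the whole period

my $ret      = waitpid $pid, WNOHANG;    # a single non-blocking check
my $finished = ($ret == $pid);

if ($finished) {
    print "Child $pid finished in time, exit status $?\n";
}
else {
    print "Child $pid still running after $timeout s, terminating it\n";
    kill 'TERM', $pid;
    waitpid $pid, 0;                     # reap it
}
```

The drawback is that the parent always waits the full $timeout, even when the job finishes early.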

Yet another option is to use alarm.
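A sketch of how that might look, following the usual eval/alarm idiom from the Perl documentation. The values and the `sleep` stand-in are again only illustrative. The parent blocks in waitpid, and the SIGALRM handler breaks out of it with die once the time is up.

```perl
use strict;
use warnings;

my ($timeout, $duration) = (2, 10);      # illustrative values

my $pid = fork // die "Can't fork: $!";
if ($pid == 0) {
    exec 'sleep', $duration or die "exec failed: $!";   # stand-in for the job
}

my $timed_out = 0;
eval {
    local $SIG{ALRM} = sub { die "alarm\n" };   # "\n" keeps the message exact
    alarm $timeout;
    waitpid $pid, 0;                     # blocking wait, interrupted by the alarm
    alarm 0;                             # child finished first: cancel the alarm
    1;
} or do {
    die $@ unless $@ eq "alarm\n";       # re-throw anything unexpected
    $timed_out = 1;
    kill 'TERM', $pid;
    waitpid $pid, 0;                     # reap the terminated child
};

print $timed_out
    ? "Timed out, child $pid terminated\n"
    : "Child $pid exited with: $?\n";
```

Unlike the polling loop, the parent cannot do other work while it waits, but it returns as soon as the child exits.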