Status of a daemonized MCE process affects the parent process

Asked: 2015-09-01 16:58:21

Tags: linux perl parallel-processing

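I have been using MCE for a few things and it has worked well. I need to watch for events and then dispatch an MCE process to handle each event. This works fine, but I have run into a problem: an error in a child process takes down the parent process, even though I expected only the child MCE processes to be affected. Here is a short program that demonstrates the behavior.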

#!/usr/bin/perl

use strict;
use warnings;

use MCE::Loop;
use MCE::Signal '-setpgrp';
use POSIX "setsid";

$SIG{CHLD} = 'IGNORE';

my $mce_maxWorkers = 2;
my $mce_chunkSize = 1;
my @pids;
my $i = 0;

my $name = shift;

while ($i < 2) {
    my $pid = fork();

    if (!defined $pid) {
        print "Can't fork: $!\n";
    }

    elsif ($pid == 0) {

        #setpgrp(0,0);
        (setsid() != -1) || die "Can't start a new session: $!";

        MCE::Loop::init {
            max_workers => $mce_maxWorkers,
            chunk_size => $mce_chunkSize,
            on_post_exit => sub {
                my ($mce, $e) = @_;
                print "$e->{wid}: $e->{pid}: status $e->{status}: $e->{msg}\n";
            }
        };

        my $tail = 'tail -q -f '.$name;
        open my $tail_fh, "-|", $tail or die "Can't open tail\n";

        mce_loop_f {
            my ($mce, $chunk_ref, $chunk_id) = @_;
            my $line = ${$chunk_ref}[0];
            chomp($line);
            print $line."\n";

        } $tail_fh;
        close $tail_fh;

        MCE->shutdown;
        exit;
    }

    else {
        print $pid."\n";
        $i++;
        push(@pids,$pid);
    }
}

foreach my $p (@pids) {
    waitpid $p, 0;
}

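When run, this program forks two child processes, each of which tails the file and reads its contents with an MCE loop that has two workers. That gives 7 processes: 1 parent, 2 MCE managers and 4 MCE workers (plus 2 tail processes).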

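With setsid, the MCE manager processes should be detached from the parent. Shouldn't anything that causes these children to die leave the parent unaffected?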
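To make the setsid expectation concrete, here is a minimal sketch, separate from the program above, that forks one child, calls setsid in it, and reports the IDs back over a pipe. The helper name child_ids_after_setsid is made up for this illustration. After setsid the child should lead its own process group, which is what the PGID and SID columns of ps show.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw( setsid );

$| = 1;    # flush STDOUT so a forked child never re-emits buffered output

# Hypothetical helper: fork a child, detach it with setsid, and return
# the (pid, process group id) pair as the child itself sees them.
sub child_ids_after_setsid {
    pipe(my $reader, my $writer) or die "Can't create pipe: $!";

    my $pid = fork();
    die "Can't fork: $!" unless defined $pid;

    if ($pid == 0) {
        close $reader;
        my $sid = setsid();
        die "Can't start a new session: $!" if !defined $sid || $sid == -1;
        print {$writer} join(' ', $$, getpgrp()), "\n";
        close $writer;
        exit 0;
    }

    close $writer;
    my $line = <$reader>;
    close $reader;
    waitpid($pid, 0);
    chomp $line;
    return split ' ', $line;
}

my ($child_pid, $child_pgid) = child_ids_after_setsid();
printf "parent: pid=%d pgid=%d\n", $$, getpgrp();
printf "child:  pid=%d pgid=%d\n", $child_pid, $child_pgid;
```

The child's process group ID equals its own PID and differs from the parent's group, so a signal aimed at the child's group does not reach the parent through group membership alone.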

Here is the output of ps -efj | grep monitor:

user1  29001   978 29001   978  0 11:41 pts/2    00:00:00 /usr/bin/perl ./monitor1_test.pl tmp/monitor1/test1.log
user1  29002 29001 29002 29002  0 11:41 ?        00:00:00 /usr/bin/perl ./monitor1_test.pl tmp/monitor1/test1.log
user1  29003 29001 29003 29003  0 11:41 ?        00:00:00 /usr/bin/perl ./monitor1_test.pl tmp/monitor1/test1.log
user1  29004 29002 29002 29002  0 11:41 ?        00:00:00 tail -q -f tmp/monitor1/test1.log
user1  29005 29003 29003 29003  0 11:41 ?        00:00:00 tail -q -f tmp/monitor1/test1.log
user1  29006 29002 29002 29002  0 11:41 ?        00:00:00 /usr/bin/perl ./monitor1_test.pl tmp/monitor1/test1.log
user1  29007 29002 29002 29002  0 11:41 ?        00:00:00 /usr/bin/perl ./monitor1_test.pl tmp/monitor1/test1.log
user1  29008 29003 29003 29003  0 11:41 ?        00:00:00 /usr/bin/perl ./monitor1_test.pl tmp/monitor1/test1.log
user1  29009 29003 29003 29003  0 11:41 ?        00:00:00 /usr/bin/perl ./monitor1_test.pl tmp/monitor1/test1.log

If I send SIGTERM to 29002 above, I expect that process to go away along with 29004, 29006 and 29007. I also expect processes 29001 and 29003 to be unaffected.

What I see instead is that 29001 dies along with 29002, while 29003 stays alive. The following error shows up on the terminal.

shell $ ./monitor1_test.pl tmp/monitor1/test1.log 
29002
29003
test1234
test1234

## monitor1_test.pl: caught signal (INT), exiting

Killed
shell $ MCE::shutdown: method cannot be called while running at /usr/share/perl5/site_perl/MCE/Signal.pm line 371.
END failed--call queue aborted at ./monitor1_test.pl line 371, <$tail_fh> line 1.

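Why does the termination of one child process affect the parent in this way? Am I doing something wrong, or am I wrongly assuming that the parent should survive this? I'm a bit stumped at this point, so any advice would be greatly appreciated.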

Platform: Linux 4.0.6 x86_64
Perl: 5.22

1 Answer:

Answer 0: (score: 3)

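The module caches the PID when it is loaded. Since it is loaded before the fork here, each child inherits the parent's PID in that cached value. Fix it by doing the following post-fork: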

$MCE::Signal::main_proc_id = $$;
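The effect of a PID cached before fork can be reproduced without MCE. The sketch below uses a made-up package, PidCache, as a stand-in for a module that records $$ once at load time; it is not MCE::Signal's actual code, and the helper pids_seen_by_child is likewise hypothetical. It only shows why the child sees a stale value until it is re-assigned.

```perl
#!/usr/bin/perl
use strict;
use warnings;

$| = 1;    # flush STDOUT so a forked child never re-emits buffered output

# Hypothetical stand-in for a module that records its PID once, when loaded.
package PidCache {
    our $main_proc_id = $$;    # set in the parent, before any fork
}

# Fork a child and return (cached PID, real PID) as the child sees them.
# When $refresh is true, the child re-assigns the cached value first.
sub pids_seen_by_child {
    my ($refresh) = @_;

    pipe(my $reader, my $writer) or die "Can't create pipe: $!";
    my $pid = fork();
    die "Can't fork: $!" unless defined $pid;

    if ($pid == 0) {
        close $reader;
        $PidCache::main_proc_id = $$ if $refresh;
        print {$writer} "$PidCache::main_proc_id $$\n";
        close $writer;
        exit 0;
    }

    close $writer;
    my $line = <$reader>;
    close $reader;
    waitpid($pid, 0);
    chomp $line;
    return split ' ', $line;
}

my ($stale, $child) = pids_seen_by_child(0);
print "without refresh: cached=$stale child=$child parent=$$\n";

my ($fresh, $child2) = pids_seen_by_child(1);
print "with refresh:    cached=$fresh child=$child2\n";
```

Without the refresh, the cached value in the child is still the parent's PID, so any cleanup code that signals "the main process" would target the parent. With the refresh, the cached value matches the child.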

Better yet, delay loading MCE until after the fork. I would do this by moving
use MCE::Loop;
use MCE::Signal '-setpgrp';

into a module (say Worker.pm), moving the child's code into a sub named run in that same module, and then doing the following post-fork:

require Worker;
Worker::run();
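The reason require helps is that it loads the module at run time, at the point where the statement executes, whereas use loads it at compile time, before any fork has happened. A small sketch of that timing follows; Text::Wrap is used only because it ships with Perl and stands in for Worker, and the variable names are made up for the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# %INC records every module file that has been loaded so far.
my $loaded_before = exists $INC{'Text/Wrap.pm'} ? 1 : 0;

# Nothing from Text::Wrap has run up to this point. In the forking
# script, this is the line that would sit inside the child branch.
require Text::Wrap;

my $loaded_after = exists $INC{'Text/Wrap.pm'} ? 1 : 0;

print "loaded before require: $loaded_before\n";
print "loaded after require:  $loaded_after\n";
```

Any top-level code in the required module, such as caching $$, therefore runs in whichever process executes the require.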

script

#!/usr/bin/perl

use strict;
use warnings;

use POSIX qw( setsid );

my $name = shift;

my @pids;
while (@pids < 2) {
    my $pid = fork();

    if (!defined $pid) {
        print "Can't fork: $!\n";
    }

    elsif ($pid == 0) {
        (setsid() != -1)
            or die "Can't start a new session: $!";

        require Worker;
        Worker::run($name);
        exit;
    }

    else {
        print $pid."\n";
        push(@pids, $pid);
    }
}

for my $pid (@pids) {
    waitpid($pid, 0);
}

Worker.pm

package Worker;

use strict;
use warnings;

use MCE::Loop;
use MCE::Signal '-setpgrp';

my $mce_maxWorkers = 2;
my $mce_chunkSize  = 1;

sub run {
    my $name = shift;

    MCE::Loop::init {
        max_workers => $mce_maxWorkers,
        chunk_size => $mce_chunkSize,
        on_post_exit => sub {
            my ($mce, $e) = @_;
            print "$e->{wid}: $e->{pid}: status $e->{status}: $e->{msg}\n";
        }
    };

    my $tail = 'tail -q -f '.$name;
    open my $tail_fh, "-|", $tail or die "Can't open tail\n";

    mce_loop_f {
        my ($mce, $chunk_ref, $chunk_id) = @_;
        my $line = ${$chunk_ref}[0];
        chomp($line);
        print $line."\n";

    } $tail_fh;
    close $tail_fh;

    MCE->shutdown;
}

1;