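I've been using MCE for some work and it has been working well. I need to watch for an event and then dispatch an MCE process to handle it. This works fine, but I've run into a problem: an error in a child process brings down the parent, when I expected only the child MCE process to be affected. Here is a short program that demonstrates the behavior.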
#!/usr/bin/perl
use strict;
use warnings;
use MCE::Loop;
use MCE::Signal '-setpgrp';
use POSIX "setsid";
$SIG{CHLD} = 'IGNORE';
my $mce_maxWorkers = 2;
my $mce_chunkSize = 1;
my @pids;
my $i = 0;
my $name = shift;
while ($i < 2) {
    my $pid = fork();
    if (!defined $pid) {
        print "Can't fork: $!\n";
    }
    elsif ($pid == 0) {
        #setpgrp(0,0);
        (setsid() != -1) || die "Can't start a new session: $!";
        MCE::Loop::init {
            max_workers => $mce_maxWorkers,
            chunk_size => $mce_chunkSize,
            on_post_exit => sub {
                my ($mce, $e) = @_;
                print "$e->{wid}: $e->{pid}: status $e->{status}: $e->{msg}\n";
            }
        };
        my $tail = 'tail -q -f '.$name;
        open my $tail_fh, "-|", $tail or die "Can't open tail\n";
        mce_loop_f {
            my ($mce, $chunk_ref, $chunk_id) = @_;
            my $line = ${$chunk_ref}[0];
            chomp($line);
            print $line."\n";
        } $tail_fh;
        close $tail_fh;
        MCE->shutdown;
        exit;
    }
    else {
        print $pid."\n";
        $i++;
        push(@pids,$pid);
    }
}
foreach my $p (@pids) {
    waitpid $p, 0;
}
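When run, this program forks two children, each of which tails a file and reads its contents using an MCE loop with two workers. That results in 7 processes: 1 parent, 2 MCE managers, and 4 MCE workers (plus 2 tail processes).
With setsid, the MCE manager processes should be detached from the parent. Shouldn't anything that kills those children leave the parent unaffected?
Here is the output of ps -efj | grep monitor: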
user1 29001 978 29001 978 0 11:41 pts/2 00:00:00 /usr/bin/perl ./monitor1_test.pl tmp/monitor1/test1.log
user1 29002 29001 29002 29002 0 11:41 ? 00:00:00 /usr/bin/perl ./monitor1_test.pl tmp/monitor1/test1.log
user1 29003 29001 29003 29003 0 11:41 ? 00:00:00 /usr/bin/perl ./monitor1_test.pl tmp/monitor1/test1.log
user1 29004 29002 29002 29002 0 11:41 ? 00:00:00 tail -q -f tmp/monitor1/test1.log
user1 29005 29003 29003 29003 0 11:41 ? 00:00:00 tail -q -f tmp/monitor1/test1.log
user1 29006 29002 29002 29002 0 11:41 ? 00:00:00 /usr/bin/perl ./monitor1_test.pl tmp/monitor1/test1.log
user1 29007 29002 29002 29002 0 11:41 ? 00:00:00 /usr/bin/perl ./monitor1_test.pl tmp/monitor1/test1.log
user1 29008 29003 29003 29003 0 11:41 ? 00:00:00 /usr/bin/perl ./monitor1_test.pl tmp/monitor1/test1.log
user1 29009 29003 29003 29003 0 11:41 ? 00:00:00 /usr/bin/perl ./monitor1_test.pl tmp/monitor1/test1.log
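If I send SIGTERM to 29002 above, I expect that process to go away along with 29004, 29006, and 29007. I also expect processes 29001 and 29003 to be unaffected.
What I see instead is that 29001 dies along with 29002, while 29003 stays alive. The following errors appear on the terminal.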
shell $ ./monitor1_test.pl tmp/monitor1/test1.log
29002
29003
test1234
test1234
## monitor1_test.pl: caught signal (INT), exiting
Killed
shell $ MCE::shutdown: method cannot be called while running at /usr/share/perl5/site_perl/MCE/Signal.pm line 371.
END failed--call queue aborted at ./monitor1_test.pl line 371, <$tail_fh> line 1.
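Why does the termination of one child process affect the parent this way? Am I doing something wrong, or am I making a wrong assumption that the parent should survive this? I'm a bit stumped right now, so any advice would be greatly appreciated.
Platform: Linux 4.0.6 x86_64
Perl: 5.22
Answer 0 (score: 3)
MCE::Signal caches the PID at module load time. Because the module is loaded before the fork, each child inherits the parent's PID in that cache. Fix it by doing the following post-fork: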
$MCE::Signal::main_proc_id = $$;
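To see why a PID cached at load time goes stale, here is a minimal sketch that does not use MCE at all. The variable $cached_pid and the sub pids_seen_by_child are hypothetical stand-ins for illustration; this is not MCE's actual code, just the general pattern of recording $$ once before a fork.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a module-level variable that records the PID
# once, when the code is loaded (NOT MCE's real implementation).
my $cached_pid = $$;

# Forks one child and reports which PIDs the child sees.
sub pids_seen_by_child {
    pipe(my $reader, my $writer) or die "Can't create pipe: $!";
    my $pid = fork();
    die "Can't fork: $!" if !defined $pid;

    if ($pid == 0) {
        close $reader;
        # The child inherits $cached_pid unchanged, but $$ is its own PID.
        print {$writer} "$cached_pid $$\n";
        close $writer;
        exit 0;
    }

    close $writer;
    my $line = <$reader>;
    close $reader;
    waitpid($pid, 0);
    chomp $line;
    my ($cached, $actual) = split ' ', $line;
    return (parent => $$, child => $pid, cached => $cached, actual => $actual);
}

my %seen = pids_seen_by_child();
print "parent PID:         $seen{parent}\n";
print "child's real PID:   $seen{actual}\n";
print "child's cached PID: $seen{cached}\n";
```

In the child, the cached value still matches the parent's PID, so any code that acts on it is really pointing at the parent. Reassigning the cached variable to $$ right after the fork, as in the one-line fix above, removes that mismatch.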
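Better yet, delay loading MCE until after the fork. I'd do this by moving
use MCE::Loop;
use MCE::Signal '-setpgrp';
into a module (say Worker.pm), moving the child's code into a sub named run in that same module, and then doing the following post-fork: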
require Worker;
Worker::run();
script:
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw( setsid );
my $name = shift;
my @pids;
while (@pids < 2) {
    my $pid = fork();
    if (!defined $pid) {
        print "Can't fork: $!\n";
    }
    elsif ($pid == 0) {
        (setsid() != -1)
            or die "Can't start a new session: $!";
        require Worker;
        Worker::run($name);
        exit;
    }
    else {
        print $pid."\n";
        push(@pids, $pid);
    }
}
for my $pid (@pids) {
    waitpid($pid, 0);
}
Worker.pm:
package Worker;
use strict;
use warnings;
use MCE::Loop;
use MCE::Signal '-setpgrp';
my $mce_maxWorkers = 2;
my $mce_chunkSize = 1;
sub run {
    my $name = shift;
    MCE::Loop::init {
        max_workers => $mce_maxWorkers,
        chunk_size => $mce_chunkSize,
        on_post_exit => sub {
            my ($mce, $e) = @_;
            print "$e->{wid}: $e->{pid}: status $e->{status}: $e->{msg}\n";
        }
    };
    my $tail = 'tail -q -f '.$name;
    open my $tail_fh, "-|", $tail or die "Can't open tail\n";
    mce_loop_f {
        my ($mce, $chunk_ref, $chunk_id) = @_;
        my $line = ${$chunk_ref}[0];
        chomp($line);
        print $line."\n";
    } $tail_fh;
    close $tail_fh;
    MCE->shutdown;
}
1;
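The reason a post-fork require helps can be checked without MCE. The sketch below writes a tiny throwaway module to a temp directory and loads it only in the child. The module name PidAtLoad, its loaded_in sub, and load_after_fork are all made up for this illustration; the point is only that load-time state is captured in whichever process runs the require.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw( tempdir );

# Write a tiny hypothetical module that records $$ at load time.
my $dir = tempdir(CLEANUP => 1);
open my $fh, '>', "$dir/PidAtLoad.pm" or die "Can't write module: $!";
print {$fh} <<'END_MODULE';
package PidAtLoad;
use strict;
use warnings;
my $pid_at_load = $$;            # captured once, when the module is loaded
sub loaded_in { return $pid_at_load }
1;
END_MODULE
close $fh or die "Can't close module file: $!";
unshift @INC, $dir;

# Forks a child that loads the module after the fork, and returns
# the child's PID together with the PID the module recorded.
sub load_after_fork {
    pipe(my $reader, my $writer) or die "Can't create pipe: $!";
    my $pid = fork();
    die "Can't fork: $!" if !defined $pid;

    if ($pid == 0) {
        close $reader;
        require PidAtLoad;       # run-time load, so it happens in the child
        print {$writer} PidAtLoad::loaded_in(), "\n";
        close $writer;
        exit 0;
    }

    close $writer;
    my $loaded_in = <$reader>;
    close $reader;
    waitpid($pid, 0);
    chomp $loaded_in;
    return ($pid, $loaded_in);
}

my ($child_pid, $loaded_in) = load_after_fork();
print "child PID:             $child_pid\n";
print "PID recorded at load:  $loaded_in\n";
```

Because require runs at run time rather than at compile time, the module's load-time code executes in the child and records the child's own PID. A use statement at the top of the script would instead have run in the parent before any fork.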