Question

I have a large array of roughly 10,000 items and want to process it with Parallel::ForkManager.

How can I process the data in chunks of 1000? I have the following code:

use Parallel::ForkManager;

my $MAX_PROCESSES = 10;
my $pm = Parallel::ForkManager->new($MAX_PROCESSES);

for (<>) {
    my $pid = $pm->start and next;
    # Here I want to process my data in chunks of 1000, using 10 Parallel::ForkManager processes
    $pm->finish;
}

How should I adapt my code?

Answer 1

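As the documentation shows, data is passed from the parent to the child, so each chunk has to be read in the parent before the fork. You need something of the following form: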

for (;;) {
   ... get a chunk ...
   my $pid = $pm->start and next; 
   ... process chunk ...
   $pm->finish; 
}

So:

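use strict;
use warnings;
use Parallel::ForkManager;

use constant CHUNK_SIZE    => 1000;
use constant MAX_PROCESSES => 10;

my $pm = Parallel::ForkManager->new(MAX_PROCESSES);

CHUNK:
for (my $eof = 0; !$eof;) {
   my @chunk;
   while (@chunk < CHUNK_SIZE) {
      my $line = <>;
      if (!defined($line)) {   # defined(), so a final line of "0" is not mistaken for EOF
         if (@chunk) {
            $eof = 1;  # Can't rely on a handle returning EOF twice,
            last;      #   so we have to make a note of it.
         } else {
            last CHUNK;
         }
      }

      push @chunk, $line;
   }

   # @chunk was filled in the parent, so the forked child gets its own copy.
   my $pid = $pm->start and next;
   # ... process chunk ...
   $pm->finish;
}

# Don't let the parent exit while children are still running.
$pm->wait_all_children;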
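
The question mentions that the data is already in an array, in which case the chunking step can work on the array directly instead of reading from <>. A minimal sketch using only core Perl; the helper name chunks_of and the sample data are hypothetical and not part of Parallel::ForkManager:

```perl
use strict;
use warnings;

# Hypothetical helper: split a list into array references that hold
# at most $size elements each.
sub chunks_of {
    my ($size, @items) = @_;
    die "chunk size must be positive\n" if $size < 1;

    my @chunks;
    # splice removes up to $size elements from the front of the local copy,
    # so the caller's array is left untouched.
    push @chunks, [ splice @items, 0, $size ] while @items;
    return @chunks;
}

my @data   = (1 .. 10_000);           # stand-in for the real data
my @chunks = chunks_of(1000, @data);

printf "%d chunks, last chunk has %d items\n",
    scalar @chunks, scalar @{ $chunks[-1] };
```

Each element of @chunks could then be handed to one child by looping over @chunks and placing the processing between $pm->start and $pm->finish, as in the loop above.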

Answer 2

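use Parallel::ForkManager;

my $max_procs = 4;
my $pm = Parallel::ForkManager->new($max_procs);

# $cli_to_execute (an array ref of commands) and cli_command() are defined elsewhere.
foreach my $index ( 0 .. $max_procs-1 ) {
    my $cmd = $cli_to_execute->[$index];

    # Forks; returns the child's pid in the parent and 0 in the child.
    my $pid = $pm->start($index) and next;

    # Only the child process reaches this point.
    my $out = cli_command( $cmd );

    # Exit with status 0 and hand $out back to the parent, which receives it
    # in a run_on_finish callback if one is registered.
    $pm->finish(0, \$out);
}

print "Waiting for children...\n";
$pm->wait_all_children;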
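
The reference passed as the second argument to finish only reaches the parent through a run_on_finish callback. A minimal sketch of that step, assuming a Unix-like shell with echo available; the command list and the %output hash are illustrative and not taken from the answer:

```perl
use strict;
use warnings;
use Parallel::ForkManager;

my @commands = ('echo one', 'echo two', 'echo three');  # hypothetical commands
my $pm       = Parallel::ForkManager->new(4);
my %output;

# Runs in the parent each time a child is reaped. The sixth argument is the
# reference the child passed as the second argument to finish().
$pm->run_on_finish(sub {
    my ($pid, $exit_code, $ident, $exit_signal, $core_dump, $data_ref) = @_;
    $output{$ident} = defined $data_ref ? $$data_ref : undef;
});

for my $index (0 .. $#commands) {
    $pm->start($index) and next;       # parent: move on to the next command

    my $out = `$commands[$index]`;     # child: run the command, capture stdout
    $pm->finish(0, \$out);             # exit code 0, plus data for the parent
}

$pm->wait_all_children;

# Children may finish in any order, so sort by the identifier given to start().
print "$_: $output{$_}" for sort { $a <=> $b } keys %output;
```

Keying the results by the identifier given to start keeps them tied to the command that produced them, regardless of the order in which the children exit.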
