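I have a recursive function that calls a system command to list files and directories. For each directory, it calls itself again.
This process can take a while, which is why I want to run the jobs in parallel.
I have been looking at ForkManager, but it does not allow a child to fork new children. Since the number of child processes should be limited to 10, I was thinking of a "worker" concept: 10 workers waiting for jobs to execute.
My recursive function: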
sub pullDataFromDbWithDirectory {
    my $_dir = $_[0];
    my @list = ();
    # Stop once the item limit is reached (numeric comparison).
    if ($itemCount < $maxNumberOfItems) {
        my @retval = grep { /dir|file/ } map { s/^Dir\s+|^File\s+|\n//g; $_ } qx($omnidb -filesystem $filesystem '$label' -listdir '$_dir');
        foreach my $item (@retval) {
            $itemCount++;
            push(@list,$item) if $item =~ /^file/;
            if ($item =~ /^dir/) {
                my $subdir = "$_dir/$item";
                $data{$subdir} = ();
                if ($recursive) {
                    pullDataFromDbWithDirectory($subdir);
                }
            }
        }
        $data{$_dir} = \@list;
    }
}
Any help is much appreciated.
Update
Problem solved. Thanks for your input. I modified my code:
sub pullDataFromDbWithDirectory {
    my $_dir = $_[0];
    if ($itemCount <= $maxNumberOfItems) {
        my @retval = grep { /dir|file/ } map { s/^Dir\s+|^File\s+|\n//g; $_ } qx($omnidb -filesystem $filesystem '$label' -listdir '$_dir');
        foreach my $item (@retval) {
            $itemCount++;
            my $file = "$_dir/$item";
            push(@data,$file);
            # Directories are handed to the worker queue instead of recursing.
            if ($item =~ /^dir/) {
                $worker->enqueue($file);
                print "Add $file to queue\n" if $debug;
            }
        }
    }
}
sub doOperation () {
    my $ithread = threads->tid();
    # An undef item ends the loop and lets the thread finish.
    while (my $folder = $worker->dequeue()) {
        print "Read $folder from queue\n" if $debug;
        pullDataFromDbWithDirectory($folder);
    }
}
my @threads = map threads->create(\&doOperation), 1 .. $maxNumberOfParallelJobs;
pullDataFromDbWithDirectory($directory);
# One undef per worker as an end-of-work marker.
$worker->enqueue((undef) x $maxNumberOfParallelJobs);
$_->join for @threads;
Answer 0 (score: 2)
I would rewrite your code to use a proper Perl module such as File::Find; it would be more efficient.
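use File::Find;
my %data;
find(\&wanted, @directories_to_search);
sub wanted {
    # $_ is the current entry name, $File::Find::dir the directory containing it.
    # Collect all entries per directory instead of keeping only the last one.
    push @{ $data{$File::Find::dir} }, $_;
}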
For parallel operation, I would use Thread::Queue like this:
use strict;
use warnings;
use threads;
use threads::shared;   # needed for the ":shared" attribute below
use Thread::Queue;
my $q = Thread::Queue->new(); # A new empty queue
my %seen: shared;
# Worker threads
my @thrs = map { threads->create(\&doOperation) } 1..5; # 5 threads
add_file_to_q('/tmp');
# One end marker per worker. Note: the markers are queued right after the
# first directory listing, so directories that workers add later may be
# left unprocessed once every worker has read its marker.
$q->enqueue('//_DONE_//') for @thrs;
$_->join() for @thrs;
sub add_file_to_q {
    my $dir = shift;
    my @files = `ls -1 $dir/`; chomp(@files);
    # Add files to the queue
    foreach my $f (@files) {
        # Send work to the threads; ls prints bare names, so prepend the directory
        $q->enqueue("$dir/$f");
        print "Pending items: " . $q->pending() . "\n";
    }
}
sub doOperation {
    my $ithread = threads->tid();
    while (defined(my $filename = $q->dequeue())) {
        # Do work on $filename
        return 1 if $filename eq '//_DONE_//';
        next if $seen{$filename};
        print "[id=$ithread]\t$filename\n";
        $seen{$filename} = 1;
        ### Add files if it is a directory (watch out for symlinks; no file may be named //_DONE_//!)
        add_file_to_q($filename) if -d $filename;
    }
    return 1;
}
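One thing to watch with this pattern is shutdown: end markers that are queued while workers are still discovering directories can stop the workers too early. A possible way around that is to count the directories that are queued or in progress and close the queue only when that count drops to zero. The following is a minimal sketch of that idea, not a drop-in replacement. It assumes a Perl built with thread support and Thread::Queue 3.01 or later (for `end()`). The names `list_dir`, `add_dir`, `worker`, `scan_tree`, `$max_workers`, `$pending` and `@found` are made up for illustration, and `list_dir` uses `opendir` as a stand-in for the external listing command.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Queue;

my $max_workers = 4;                  # hypothetical limit; the question wants 10
my $queue = Thread::Queue->new();
my $pending :shared = 0;              # directories queued or still being processed
my @found   :shared;                  # full paths collected by all workers

# Hypothetical stand-in for the external listing command:
# returns the entries of one directory as full paths.
sub list_dir {
    my ($dir) = @_;
    opendir(my $dh, $dir) or return;
    my @entries = grep { $_ ne '.' && $_ ne '..' } readdir($dh);
    closedir($dh);
    return map { "$dir/$_" } @entries;
}

# Count a directory as pending before it becomes visible to the workers.
sub add_dir {
    my ($dir) = @_;
    { lock($pending); $pending++; }
    $queue->enqueue($dir);
}

sub worker {
    # dequeue() returns undef once the queue has been ended and drained.
    while (defined(my $dir = $queue->dequeue())) {
        for my $path (list_dir($dir)) {
            { lock(@found); push @found, $path; }
            # Skip symlinks so that a link cycle cannot loop forever.
            add_dir($path) if -d $path && !-l $path;
        }
        # Subdirectories were counted above, so zero means nothing is left.
        lock($pending);
        $queue->end() if --$pending == 0;
    }
}

# Single use: the queue is ended when the scan completes.
sub scan_tree {
    my ($root) = @_;
    add_dir($root);
    my @threads = map { threads->create(\&worker) } 1 .. $max_workers;
    $_->join() for @threads;
    my @sorted = sort @found;
    return @sorted;
}

# Example call (path is a placeholder):
# print "$_\n" for scan_tree('/some/dir');
```

Because a directory's children are added to the count before the directory itself is subtracted, the count can only reach zero when no worker can add further work, so no end markers are needed.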