Cannot fork more than 200 processes, sometimes fewer, depending on memory and CPU usage

Time: 2016-12-20 22:47:13

Tags: multithreading perl fork multitasking

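These are the guts of a program that uses Parallel::ForkManager. It seems to stop at 200 processes, sometimes at around 30, depending on the size of the pgsql query that collects the URLs to send to Mojo::UserAgent. Is there a hard limit somewhere? Is there a better way to write this so that I don't run into these limits? The machine it runs on has 16 CPUs and 128 GB of RAM, so it should be able to run more than 200 processes, each of which dies after the Mojo::UserAgent timeout, which is usually 2 seconds.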

use Parallel::ForkManager;
use Mojo::Base -strict;
use Mojo::UserAgent;
use Mojo::Pg;
use Math::Random::Secure qw(rand irand);
use POSIX qw(strftime);
use Socket;
use GeoIP2::Database::Reader;
use File::Spec::Functions qw(:ALL);
use File::Basename qw(dirname);

use feature 'say';


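my $max_kids = 500;
my $timeout  = 2;    # seconds; the Mojo::UserAgent timeout mentioned above
my @url;             # filled by do_auth()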
sub do_auth {
...
        push( @url, $authurl );
}


do_auth();

my $pm = Parallel::ForkManager->new($max_kids);

LINKS:
foreach my $linkarray (@url) {
    $pm->start and next LINKS;    # do the fork
    my $ua = Mojo::UserAgent->new( max_redirects => 5, timeout => $timeout );
    $ua->get($linkarray);    # fetch the URL for this iteration
    $pm->finish;
}

$pm->wait_all_children;

2 answers:

Answer 0 (score: 0)

For your example code (fetching URLs) I would never use Parallel::ForkManager. I would use Mojo::IOLoop::Delay or the non-blocking call style.

use Mojo::UserAgent;
use feature 'say';

my $ua = Mojo::UserAgent->new;

$ua->inactivity_timeout(15);
$ua->connect_timeout(15);
$ua->request_timeout(15);
$ua->max_connections(0);

my @url = ("http://stackoverflow.com/questions/41253272/joining-a-view-and-a-table-in-mysql",
           "http://stackoverflow.com/questions/41252594/develop-my-own-website-builder",
           "http://stackoverflow.com/questions/41251919/chef-mysql-server-configuration",
           "http://stackoverflow.com/questions/41251689/sql-trigger-update-error",
           "http://stackoverflow.com/questions/41251369/entity-framework-how-to-add-complex-objects-to-db",
           "http://stackoverflow.com/questions/41250730/multi-dimensional-array-from-matching-mysql-columns",
           "http://stackoverflow.com/questions/41250528/search-against-property-in-json-object-using-mysql-5-6",
           "http://stackoverflow.com/questions/41249593/laravel-time-difference",
           "http://stackoverflow.com/questions/41249364/variable-not-work-in-where-clause-php-joomla");

foreach my $linkarray (@url) {
    # Run all requests at the same time
    $ua->get($linkarray => sub {
        my ($ua, $tx) = @_;
        say $tx->res->dom->at('title')->text;
    });
}
Mojo::IOLoop->start unless Mojo::IOLoop->is_running;
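
With a much longer URL list, starting every request at once may not be what you want. Below is a minimal sketch of the same non-blocking style with a cap on in-flight requests. The `fetch_next` helper, the cap of 50, and reading URLs from `@ARGV` are assumptions for illustration, not part of the original program.

```perl
use strict;
use warnings;
use feature 'say';
use Mojo::UserAgent;
use Mojo::IOLoop;

# Hypothetical values for illustration only.
my $max_in_flight = 50;       # cap on concurrent requests
my @queue         = @ARGV;    # URLs passed on the command line

my $ua = Mojo::UserAgent->new(max_redirects => 5);
$ua->request_timeout(2);

my $active = 0;               # requests currently in flight

# Start the next queued request, if any; each finished request starts another.
sub fetch_next {
    my $url = shift @queue;
    return unless defined $url;
    $active++;
    $ua->get($url => sub {
        my ($ua, $tx) = @_;
        $active--;
        if (my $err = $tx->error) {
            warn "$url failed: $err->{message}\n";
        }
        else {
            say "$url -> ", $tx->res->code;
        }
        fetch_next();
        # Stop the loop once nothing is queued or in flight.
        Mojo::IOLoop->stop unless $active;
    });
}

# Prime the pipeline with up to $max_in_flight requests, then run the loop.
fetch_next() for 1 .. $max_in_flight;
Mojo::IOLoop->start if $active;
```

Everything runs in a single process, so the number of URLs no longer translates into a number of forks.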

Answer 1 (score: -1)

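You are most likely hitting an operating system limit on threads or processes. The quick and dirty fix is to raise the limit, which is usually configurable. That said, rewriting the code so that it does not use so many short-lived threads is the more scalable solution.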
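
One way to check whether such a limit is the cause is to call `fork` directly and inspect `$!` when it fails. On Linux, `fork` fails with `EAGAIN` when a process limit is reached (in bash, `ulimit -u` reports the per-user cap). The following is a minimal sketch using only core Perl; `run_pool` and `$reap_one` are hypothetical names, and this is not how Parallel::ForkManager itself is implemented.

```perl
use strict;
use warnings;
use Errno qw(EAGAIN);

# Run each code ref in @jobs in its own child process, with at most
# $max_kids children alive at once. Returns the number of children reaped.
sub run_pool {
    my ($max_kids, @jobs) = @_;
    my %kids;          # pids of children still running
    my $reaped = 0;

    # Block until one child exits and record it.
    my $reap_one = sub {
        my $pid = waitpid(-1, 0);
        if ($pid <= 0) { %kids = (); return; }    # nothing left to wait for
        $reaped++ if delete $kids{$pid};
    };

    for my $job (@jobs) {
        # Wait for a free slot before forking again.
        $reap_one->() while keys(%kids) >= $max_kids;

        my $pid = fork();
        until (defined $pid) {
            # EAGAIN means a process limit was hit; any other error is fatal.
            die "fork failed: $!\n" unless $! == EAGAIN && %kids;
            warn "fork hit a limit with ", scalar(keys %kids),
                " children running: $!\n";
            $reap_one->();    # free a slot, then retry
            $pid = fork();
        }

        if ($pid == 0) {      # child: do the work and exit
            $job->();
            exit 0;
        }
        $kids{$pid} = 1;      # parent: remember the child
    }

    # Reap whatever is still running.
    $reap_one->() while %kids;
    return $reaped;
}
```

If the warning appears well below the requested pool size, the process cap is the bottleneck, not memory or CPU.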