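I'm working on a project implemented in Perl and thought it would be a good idea to use threads to distribute the work, since the tasks can run independently of each other and only read from shared data in memory. However, the performance is nowhere near what I expected. After some investigation I can only conclude that threads in Perl basically suck, but I keep wondering why performance goes down the drain as soon as I introduce a single shared variable.
For example, this little program shares nothing and uses 75% of the CPU (as expected):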
use threads;
sub fib {
    my ( $n ) = @_;
    if ( $n < 2 ) {
        return $n;
    } else {
        return fib( $n - 1 ) + fib( $n - 2 );
    }
}
my $thr1 = threads->create( 'fib', 35 );
my $thr2 = threads->create( 'fib', 35 );
my $thr3 = threads->create( 'fib', 35 );
$thr1->join;
$thr2->join;
$thr3->join;
As soon as I introduce a shared variable $a, CPU usage drops to somewhere between 40% and 50%:
use threads;
use threads::shared;
my $a : shared;
$a = 1000;
sub fib {
    my ( $n ) = @_;
    if ( $n < 2 ) {
        return $n;
    } else {
        return $a + fib( $n - 1 ) + fib( $n - 2 ); # <-- $a was added here
    }
}
my $thr1 = threads->create( 'fib', 35 );
my $thr2 = threads->create( 'fib', 35 );
my $thr3 = threads->create( 'fib', 35 );
$thr1->join;
$thr2->join;
$thr3->join;
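So $a is read-only and no locking takes place, yet performance still drops. I'm curious why this happens.
At the moment I'm using Perl 5.10.1 under Cygwin on Windows XP. Unfortunately, I haven't been able to test this on a non-Windows machine with a (hopefully) more recent Perl.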
Answer 0 (score: 3)
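Your code is a tight loop around a synchronized structure. Optimize it by having each thread copy the shared variable (just once per thread) into an unshared variable.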
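A minimal sketch of that suggestion, assuming the shared value does not change while the workers run. The names `$offset`, `$local`, and `worker` are made up for illustration (`$offset` plays the role of `$a` in the question), and `fib` takes the copied value as a second argument so the recursion never touches the shared variable. The argument is 25 rather than 35 only to keep the run short.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Hypothetical name; stands in for the question's shared $a.
my $offset : shared;
$offset = 1000;

# Same recursion as in the question, but the added constant is passed in
# as a plain (unshared) argument instead of being read from shared memory.
sub fib {
    my ( $n, $k ) = @_;
    return $n if $n < 2;
    return $k + fib( $n - 1, $k ) + fib( $n - 2, $k );
}

sub worker {
    my ( $n ) = @_;
    my $local = $offset;    # the only read of the shared variable in this thread
    return fib( $n, $local );
}

# create() is called in list context here, so join() returns a list.
my @thrs    = map { threads->create( \&worker, 25 ) } 1 .. 3;
my @results = map { $_->join } @thrs;

print "$_\n" for @results;
```

Whether this restores the CPU usage of the unshared version on a given Perl build is something to measure, not something this sketch guarantees.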
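Answer 1 (score: 0)
Constructing a shared object that holds a large amount of data is possible in Perl without worrying about extra copies. Spawning workers has no impact on performance, because the shared data resides in a separate thread or process, depending on whether threads are used.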
use MCE::Hobo; # use threads okay or parallel module of your choice
use MCE::Shared;
# The module option constructs the object under the shared-manager.
# There's no trace of data inside the main process. The construction
# returns a shared reference containing an id and class name.
my $data = MCE::Shared->share( { module => 'My::Data' } );
my $b;
sub fib {
    my ( $n ) = @_;
    if ( $n < 2 ) {
        return $n;
    } else {
        return $b + fib( $n - 1 ) + fib( $n - 2 );
    }
}
my @thrs;
push @thrs, MCE::Hobo->create( sub { $b = $data->get_keys(1000), fib(35) } );
push @thrs, MCE::Hobo->create( sub { $b = $data->get_keys(2000), fib(35) } );
push @thrs, MCE::Hobo->create( sub { $b = $data->get_keys(3000), fib(35) } );
$_->join() for @thrs;
exit;
# Populate $self with data. When shared, the data resides under the
# shared-manager thread (via threads->create) or process (via fork).
package My::Data;
sub new {
    my $class = shift;
    my %self;
    %self = map { $_ => $_ } 1000 .. 5000;
    bless \%self, $class;
}
# Add any getter methods to suit the application. Supporting multiple
# keys helps reduce the number of trips via IPC. Serialization is
# handled automatically if a getter method returns a hash ref.
# MCE::Shared will use Sereal::{Encoder,Decoder} if available - faster.
sub get_keys {
    my $self = shift;
    if ( wantarray ) {
        return map { $_ => $self->{$_} } @_;
    } else {
        return $self->{$_[0]};
    }
}
1;
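To illustrate the "multiple keys" comment on `get_keys`, here is a small sketch of the two calling conventions. It deliberately uses a plain, unshared `My::Data` object (no MCE::Shared involved), so it only shows how `wantarray` picks between the scalar and list forms; the assumption is that a shared object would be called the same way, with the list form fetching several keys in one round trip.

```perl
use strict;
use warnings;

package My::Data;

sub new {
    my $class = shift;
    my %self = map { $_ => $_ } 1000 .. 5000;
    return bless \%self, $class;
}

# Scalar context: value for the first key.
# List context: key/value pairs for every key requested.
sub get_keys {
    my $self = shift;
    if ( wantarray ) {
        return map { $_ => $self->{$_} } @_;
    } else {
        return $self->{ $_[0] };
    }
}

package main;

my $data = My::Data->new;    # plain object here, not a shared one

my $one  = $data->get_keys(1000);                 # scalar context, one key
my %many = $data->get_keys(1000, 2000, 3000);     # list context, several keys

print "one: $one\n";
print "$_ => $many{$_}\n" for sort keys %many;
```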