是否有一种线程安全的方式在Perl中打印?

时间:2014-04-14 15:24:29

标签: multithreading perl thread-safety stdout

我目前有一个脚本可以启动线程以在多个目录上执行各种操作。我的剧本片段是:

#main
sub BuildInit {

    my $actionStr = "";
    my $compStr   = "";

    my @component_dirs;
    my @compToBeBuilt;
    foreach my $comp (@compList) {
        @component_dirs = GetDirs($comp);    #populates @component_dirs
    }

    print "Printing Action List: @actionList\n";

    #---------------------------------------
    #----   Setup Worker Threads  ----------
    for ( 1 .. NUM_WORKERS ) {
        async {
            while ( defined( my $job = $q->dequeue() ) ) {
                worker($job);
            }
        };
    }

    #-----------------------------------
    #----   Enqueue The Work  ----------
    for my $action (@actionList) {
        my $sem = Thread::Semaphore->new(0);
        $q->enqueue( [ $_, $action, $sem ] ) for @component_dirs;

        $sem->down( scalar @component_dirs );
        print "\n------>> Waiting for prior actions to finish up... <<------\n";
    }

    # Nothing more to do - notify the Queue that we're not adding anything else
    $q->end();
    $_->join() for threads->list();

    return 0;
}

#worker
sub worker {
    my ($job) = @_;
    my ( $component, $action, $sem ) = @$job;
    Build( $component, $action );
    $sem->up();
}

#builder method
sub Build {

    my ( $comp, $action ) = @_;
    my $cmd     = "$MAKE $MAKE_INVOCATION_PATH/$comp ";
    my $retCode = -1;

    given ($action) {
        when ("depend") { $cmd .= "$action >nul 2>&1" }    #suppress output
        when ("clean")  { $cmd .= $action }
        when ("build")  { $cmd .= 'l1' }
        when ("link")   { $cmd .= '' }                     #add nothing; default is to link
        default { die "Action: $action is unknown to me." }
    }

    print "\n\t\t*** Performing Action: \'$cmd\' on $comp ***" if $verbose;

    if ( $action eq "link" ) {

        # hack around potential race conditions -- will only be an issue during linking
        my $tries = 1;
        until ( $retCode == 0 or $tries == 0 ) {
            last if ( $retCode = system($cmd) ) == 2;      #compile error; stop trying
            $tries--;
        }
    }
    else {
        $retCode = system($cmd);
    }
    push( @retCodes, ( $retCode >> 8 ) );

    #testing
    if ( $retCode != 0 ) {
        print "\n\t\t*** ERROR IN $comp: $@ !! ***\n";
        print "\t\t*** Action: $cmd -->> Error Level: " . ( $retCode >> 8 ) . "\n";

        #exit(-1);
    }

    return $retCode;
}

我希望线程安全的print语句是:print "\n\t\t*** Performing Action: \'$cmd\' on $comp ***" if $verbose;理想情况下,我希望有这个输出,然后每个组件都有{{1}在它上面执行,将以相关的块输出。但是,这显然现在不起作用 - 输出大部分都是交错的,每个线程都会自己输出信息。

E.g,:

$action

我考虑使用反引号执行系统命令,并捕获大字符串中的所有输出,然后在线程终止时立即输出所有输出。但问题是(a)它看起来效率极低,而且(b)我需要捕获ComponentAFile1.cpp ComponentAFile2.cpp ComponentAFile3.cpp ComponentBFile1.cpp ComponentCFile1.cpp ComponentBFile2.cpp ComponentCFile2.cpp ComponentCFile3.cpp ... etc.

任何人都可以看到一种方法来保持每个线程的输出分开吗?

澄清: 我想要的输出是:

stderr

3 个答案:

答案 0 :(得分:5)

为确保您的输出不被中断,对STDOUT和STDERR的访问必须是互斥的。这意味着在线程开始打印和完成打印之间,不允许其他线程打印。这可以使用Thread :: Semaphore [1]

完成

捕获输出并立即打印所有内容可以减少线程持有锁定的时间。如果你不这样做,你将有效地建立你的系统单线程系统,因为每个线程在一个线程运行时尝试锁定STDOUT和STDERR。

其他选项包括:

  1. 为每个帖子使用不同的输出文件。
  2. 将作业ID添加到每行输出,以便稍后对输出进行排序。
  3. 在这两种情况下,您只需要在很短的时间内锁定它。


    1. # Once
      my $mutex = Thread::Semaphore->new();  # Shared by all threads.
      
      
      # When you want to print.
      $mutex->down();
      print ...;
      STDOUT->flush();
      STDERR->flush();
      $mutex->up();
      

      # Once
      my $mutex = Thread::Semaphore->new();  # Shared by all threads.
      STDOUT->autoflush();
      STDERR->autoflush();
      
      
      # When you want to print.
      $mutex->down();
      print ...;
      $mutex->up();
      

答案 1 :(得分:2)

$sem->down中提到的那样,如果它尝试将信号量计数器降低到零以下,则可以利用down()的阻止行为:

  

如果my $sem = Thread::Semaphore->new( 1 ); 尝试将计数器减少到零以下,则会阻塞   直到柜台足够大。


所以这是人们可以做的事情:

使用在所有线程中共享的计数器1初始化信号量

worker

将线程计数器传递给Buildfor my $thr_counter ( 1 .. NUM_WORKERS ) { async { while ( defined( my $job = $q->dequeue() ) ) { worker( $job, $thr_counter ); } }; } sub worker { my ( $job, $counter ) = @_; Build( $component, $action, $counter ); }

->down

->up内(而不是其他地方)转到Buildsub Build { my ( $comp, $action, $counter ) = @_; ... # Execute all concurrently-executed code here $sem->down( 1 << ( $counter -1 ) ); print "\n\t\t*** Performing Action: \'$cmd\' on $comp ***" if $verbose; # Execute all sequential 'chunks' here $sem->up( 1 << ( $counter - 1) ); }

+-----------+---+---+---+---+
| Thread    | 1 | 2 | 3 | 4 |
+-----------+---+---+---+---+
| Semaphore | 1 | 2 | 4 | 8 |
+-----------+---+---+---+---+

通过使用线程计数器左移信号量计数器,它可以保证线程不会相互踩踏:

{{1}}

答案 2 :(得分:2)

过去,我通过创建IO线程并使用它来序列化文件访问来解决此问题的方式不同。

E.g。

my $output_q = Thread::Queue -> new();

sub writer {
    open ( my $output_fh, ">", $output_filename );
    while ( my $line = $output_q -> dequeue() ) {
        print {$output_fh} $line; 
    }
    close ( $output_fh );
 }

在线程中,'打印':

$output_q -> enqueue ( "text_to_print\n"; );

有或没有包装 - 例如用于时间戳语句,如果他们要去日志。 (您可能希望在排队时加时间戳,而不是在实际打印时)。