Perl - Create a new array and pass it to a subroutine running in a thread

Asked: 2013-05-01 21:05:41

Tags: arrays multithreading perl

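I am trying to create a new multidimensional array of a certain length (5, for testing), fill it with values, and then pass that array to a subroutine that runs in a thread separate from the main code, so that the main code can go on to create a new array for the next batch of values. This loop has to continue endlessly.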

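I can see that the array is passed, and I can see the value of $set[0], but the array seems to be overwritten by something. I'm not sure what is going on here. Also, the Hypertable connection object is not passed correctly: I have to create a new connection object in each thread. What am I missing here?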

#!/usr/bin/perl -w -I /opt/hypertable/0.9.7.3/lib/perl -I /opt/hypertable/0.9.7.3/lib/perl/gen-perl
use strict;
use IO::Socket;
use Geo::IP;
use threads qw(stringify);
use Net::NBName;
use Data::Dumper;
use Hypertable::ThriftClient;

my $hypertable = new Hypertable::ThriftClient("Server", 38080);
my $namespace  = $hypertable->namespace_open("TEST");
my $MAXLEN     = 1524;
my $buf        = '';

my $limit = 5;    #length of array
my $sock  = IO::Socket::INET->new(LocalPort => '514', Proto => 'udp') || die("Socket: $@");

do {
    my $count = 0;
    my @set;

    for ($count = 0; $count <= $limit; $count++) {
        $sock->recv($buf, $MAXLEN);
        my ($port, $ipaddr) = sockaddr_in($sock->peername);
        my $hn = gethostbyaddr($ipaddr, AF_INET);
        $buf =~ /<(\d+)>(.*?):(.*)/;
        my $msg = $3;
        $set[$count][0] = $hn;
        $set[$count][1] = $msg;
        print $count. " --> "
            . $set[$count][0] . " --> "
            . $set[$count][1]
            . "\n";    #Multi dimensional array

    }

    my $thr = threads->create('logsys', @set, $hypertable);

} while (1);


sub logsys {

    my $count = 0;

    for ($count = 0; $count <= $limit; $count++) {
        my $hypertable = shift; # Here I want to use the single NoSQL db connector object for all threads
        my @set = shift;

        print $count. " --> "
            . @set->[$count][0] . " --> "
            . @set->[$count][1]
            . "\n";    # Here I expect the same exact array elements

        #DO SOME MORE STUFF here
    }
}

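Edit: Here is simplified code that can be run with or without a thread. When run in a thread, not all elements of the array are processed. When run without a thread, all elements are processed.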

#!/usr/bin/perl -w -I /opt/hypertable/0.9.7.3/lib/perl -I /opt/hypertable/0.9.7.3/lib/perl/gen-perl
use strict;
use IO::Socket;
use Geo::IP;
use threads qw(stringify);
use Net::NBName;
use Data::Dumper;
use Hypertable::ThriftClient;

# Syslog Variables and Constants
my $MAXLEN = 1524;
my $limit = 5; #for testing
my $sock;
# Start Listening on UDP port 514
$sock = IO::Socket::INET->new(LocalPort => '514', Proto => 'udp') || die("Socket: $@");

my $buf = '';
  my $count = 0;
  my @set;

  for ($count = 0; $count <= $limit; $count++) {
  $sock->recv($buf, $MAXLEN);
  my ($port, $ipaddr) = sockaddr_in($sock->peername);
  my $hn = gethostbyaddr($ipaddr, AF_INET);
  $buf=~/<(\d+)>(.*?):(.*)/;
  my $msg=$3;
  $set[$count][0] = $hn;
  $set[$count][1] = $msg;
print $count." --> ".$set[$count][0]." --> ".$set[$count][1]."\n";#Print original array, should print 5 elements 
  }

  my $thr = threads->create('logsys',@set);

#&logsys(@set);

sub logsys {
my $count = 0;
my @set= @_;

print "--------------------- ".scalar (@set)." -------------------\n";

for ($count=0; $count <= $limit; $count++) {
print $count." --> ".$set[$count][0]." --> ".$set[$count][1]."\n";#print passed array, should same exact 5 elements
if (open(WW,">syslog")){print WW $count." --> ".$set[$count][0]." --> ".$set[$count][1]."\n"; close(WW);}

}
}

Output when run as a thread:

0 --> ids-01p --> 23:48 IDS01 SFIMS: [FLIDS][Enterprise][138:4:1] sensitive_data: sensitive data - U.S. social security numbers without dashes [Classification: Sensitive Data] [Priority: 2] {TCP} 10.10.97.42:3065 -> 33.87.66.38:80
1 --> ids-01p --> 23:50 IDS01 SFIMS: [FLIDS][Enterprise][138:4:1] sensitive_data: sensitive data - U.S. social security numbers without dashes [Classification: Sensitive Data] [Priority: 2] {TCP} 10.10.1.254:26616 -> 78.67.61.202:80
2 --> ids-01p --> 23:50 IDS01 SFIMS: [FLIDS][Enterprise][138:4:1] sensitive_data: sensitive data - U.S. social security numbers without dashes [Classification: Sensitive Data] [Priority: 2] {TCP} 10.10.1.254:39180 -> 56.164.27.51:80
3 --> ids-01p --> 23:51 IDS01 SFIMS: [FLIDS][Enterprise][138:4:1] sensitive_data: sensitive data - U.S. social security numbers without dashes [Classification: Sensitive Data] [Priority: 2] {TCP} 10.10.52.97:53967 -> 173.194.37.97:80
4 --> ids-01p --> 23:51 IDS01 SFIMS: [FLIDS][Enterprise][119:15:1] http_inspect: OVERSIZE REQUEST-URI DIRECTORY [Classification: Potentially Bad Traffic] [Priority: 2] {TCP} 10.190.1.254:57265 -> 34.44.17.21:80
5 --> ids-01p --> 23:51 IDS01 SFIMS: [FLIDS][Enterprise][119:15:1] http_inspect: OVERSIZE REQUEST-URI DIRECTORY [Classification: Potentially Bad Traffic] [Priority: 2] {TCP} 10.190.1.254:41960 -> 34.44.17.29:80
--------------------- 6 -------------------
0 --> ids-01p --> 23:48 IDS01 SFIMS: [FLIDS][Enterprise][138:4:1] sensitive_data: sensitive data - U.S. social security numbers without dashes [Classification: Sensitive Data] [Priority: 2] {TCP} 10.190.97.42:3065 -> 43.87.66.38:80
Perl exited with active threads:
        1 running and unjoined
        0 finished and unjoined
        0 running and detached
1 --> ids-01p --> 23:50 IDS01 SFIMS: [FLIDS][Enterprise][138:4:1] sensitive_data: sensitive data - U.S. social security numbers without dashes [Classification: Sensitive Data] [Priority: 2] {TCP} 10.190.1.254:26616 -> 43.67.61.202:80

Output when run without a thread:

0 --> ids-01p --> 36:48 IDS01 SFIMS: [FLIDS][Enterprise][138:4:1] sensitive_data: sensitive data - U.S. social security numbers without dashes [Classification: Sensitive Data] [Priority: 2] {TCP} 10.10.1.254:34053 -> 69.164.26.77:80
1 --> ids-01p --> 36:50 IDS01 SFIMS: [FLIDS][Enterprise][138:4:1] sensitive_data: sensitive data - U.S. social security numbers without dashes [Classification: Sensitive Data] [Priority: 2] {TCP} 10.1.65.51:57977 -> 216.137.41.5:80
2 --> ids-01p --> 36:53 IDS01 SFIMS: [FLIDS][Enterprise][128:4:1] ssh: Protocol mismatch [Classification: Detection of a Non-Standard Protocol or Event] [Priority: 2] {TCP} 10.10.241.46:11120 -> 10.10.125.227:22
3 --> ids-01p --> 36:54 IDS01 SFIMS: [FLIDS][Enterprise][128:4:1] ssh: Protocol mismatch [Classification: Detection of a Non-Standard Protocol or Event] [Priority: 2] {TCP} 10.10.241.46:11122 -> 10.1.125.225:22
4 --> ids-01p --> 36:54 IDS01 SFIMS: [FLIDS][Enterprise][138:4:1] sensitive_data: sensitive data - U.S. social security numbers without dashes [Classification: Sensitive Data] [Priority: 2] {TCP} 10.1.118.96:61686 -> 50.19.254.195:80
5 --> ids-01p --> 36:54 IDS01 SFIMS: [FLIDS][Enterprise][138:4:1] sensitive_data: sensitive data - U.S. social security numbers without dashes [Classification: Sensitive Data] [Priority: 2] {TCP} 10.1.1.254:29437 -> 184.73.178.248:80
--------------------- 7 -------------------
0 --> ids-01p --> 36:48 IDS01 SFIMS: [FLIDS][Enterprise][138:4:1] sensitive_data: sensitive data - U.S. social security numbers without dashes [Classification: Sensitive Data] [Priority: 2] {TCP} 10.10.1.254:34053 -> 69.164.26.77:80
1 --> ids-01p --> 36:50 IDS01 SFIMS: [FLIDS][Enterprise][138:4:1] sensitive_data: sensitive data - U.S. social security numbers without dashes [Classification: Sensitive Data] [Priority: 2] {TCP} 10.1.65.51:57977 -> 216.137.41.5:80
2 --> ids-01p --> 36:53 IDS01 SFIMS: [FLIDS][Enterprise][128:4:1] ssh: Protocol mismatch [Classification: Detection of a Non-Standard Protocol or Event] [Priority: 2] {TCP} 10.10.241.46:11120 -> 10.10.125.227:22
3 --> ids-01p --> 36:54 IDS01 SFIMS: [FLIDS][Enterprise][128:4:1] ssh: Protocol mismatch [Classification: Detection of a Non-Standard Protocol or Event] [Priority: 2] {TCP} 10.10.241.46:11122 -> 10.1.125.225:22
4 --> ids-01p --> 36:54 IDS01 SFIMS: [FLIDS][Enterprise][138:4:1] sensitive_data: sensitive data - U.S. social security numbers without dashes [Classification: Sensitive Data] [Priority: 2] {TCP} 10.1.118.96:61686 -> 50.19.254.195:80
5 --> ids-01p --> 36:54 IDS01 SFIMS: [FLIDS][Enterprise][138:4:1] sensitive_data: sensitive data - U.S. social security numbers without dashes [Classification: Sensitive Data] [Priority: 2] {TCP} 10.1.1.254:29437 -> 184.73.178.248:80

2 Answers:

Answer 0 (score: 2)

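First, you call the thread constructor with the arguments (..., @set, $hypertable), but inside the thread you assign $hypertable with shift before you assign @set, so $hypertable receives the first row of @set instead of the connection object. You will want to either change the order of the arguments, like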

threads->create('logsys',$hypertable,@set);

or use pop to remove the argument from the end of @_ instead of shift (which removes from the beginning):

my $hypertable = pop;   # same as  pop(@_)

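Second, an assignment like @set = shift is almost always a mistake. shift returns a single scalar value, so assigning it to an array gives you an array with exactly one element. After you have assigned $hypertable in the thread and removed its value from @_, everything that remains in @_ is the @set you provided, so you can just say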

my @set = @_;
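To make the difference concrete, here is a minimal sketch (no threads involved) that compares the two assignments. The subroutine names takes_one and takes_all and the sample rows are made up for illustration and are not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helpers: each returns how many rows ended up in @set.
sub takes_one { my @set = shift; return scalar @set; }   # only the first argument
sub takes_all { my @set = @_;    return scalar @set; }   # every argument

# Sample 2D data standing in for the rows built in the question.
my @rows = ( [ 'host-a', 'msg 1' ], [ 'host-b', 'msg 2' ], [ 'host-c', 'msg 3' ] );

my $one = takes_one(@rows);
my $all = takes_all(@rows);

print "shift: $one element(s), \@_: $all element(s)\n";
```

With three rows passed in, the shift version keeps one row and the @_ version keeps all three, which matches the symptom of only $set[0] being visible.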

@set->[...] also doesn't make sense in Perl. To access the elements of your 2D array @set inside the thread, use the same notation you used to create the array:

print $count." --> ".$set[$count][0]." --> ".$set[$count][1]."\n";
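Putting these points together, the following is a small self-contained sketch rather than a drop-in fix. It assumes a perl built with thread support, and it uses a plain string ($handle) as a hypothetical stand-in for the connection object and made-up rows in place of the syslog data. The scalar is passed first, the thread unpacks it with shift, takes the remaining rows from @_, and the main program joins the thread before exiting.

```perl
use strict;
use warnings;
use threads;

# Hypothetical stand-in for the connection object (just a string here).
my $handle = 'db-handle';

# Made-up rows shaped like the question's 2D array: [hostname, message].
my @set = (
    [ 'host-a', 'first message'  ],
    [ 'host-b', 'second message' ],
    [ 'host-c', 'third message'  ],
);

sub logsys {
    my $handle = shift;    # the scalar was passed first, so shift returns it
    my @set    = @_;       # everything left over is the rows

    my @lines;
    for my $i (0 .. $#set) {    # loop over the rows actually received
        push @lines, "$i --> $set[$i][0] --> $set[$i][1]";
    }
    return join("\n", @lines);
}

# The arguments are copied into the new thread when it is created.
my $thr    = threads->create(\&logsys, $handle, @set);
my $report = $thr->join();      # wait for the thread and collect its result

print "$report\n";
```

Waiting on join() before the main program ends also avoids the "Perl exited with active threads" message that appears in the threaded output in the question, where the main program finished while the thread was still printing.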

Answer 1 (score: 0)

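People complain that threads are slow, but that is only the case when they are used incorrectly, for example when spawning lots of threads instead of using worker threads.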

Here is how you would address this in your code:

use threads;
use Thread::Queue::Any 1.03 qw( );

sub logsys {                      # Gets a reference, so uses
   my ($hypertable, $set) = @_;   #    @$set and $set->[...]  
   ... $set->[...][...] ...       # instead of
}                                 #    @set and $set[...]

my $db_queue = Thread::Queue::Any->new();

my $db_thread = async { 
   my $hypertable = ...;    # Only executed once.
   while (my $set = $db_queue->dequeue()) {
      logsys($hypertable, $set);
   }
};

... $db_queue->enqueue(\@set); ...

$db_queue->enqueue(undef);  # Signal that we're done.
$db_thread->join();         # Wait for db to be done.

By having only one database thread, this also takes care of the impossible wish of sharing a single database connection among all threads.
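For readers who want something they can run as is, here is a sketch of the same single-worker pattern. It swaps in the core Thread::Queue module in place of Thread::Queue::Any, which is a substitution and not what the answer uses, and it assumes a perl built with thread support and a Thread::Queue recent enough to copy nested data on enqueue. The string $handle and the generated batches are hypothetical placeholders for the connection object and the syslog data.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $queue = Thread::Queue->new();

# One long-lived worker: it creates its "connection" once and reuses it.
my $worker = threads->create(sub {
    my $handle = 'db-handle';    # hypothetical stand-in, created once per worker
    my $rows   = 0;

    # dequeue() blocks until a batch arrives; undef is the stop signal.
    while (defined(my $set = $queue->dequeue())) {
        for my $row (@$set) {
            print "$handle: $row->[0] --> $row->[1]\n";
            $rows++;
        }
    }
    return $rows;
});

# Main code: build a fresh batch each time and hand it off as a reference.
for my $batch (1 .. 3) {
    my @set = map { [ "host-$batch", "message $_" ] } 1 .. 2;
    $queue->enqueue(\@set);      # the queue stores its own copy of the data
}

$queue->enqueue(undef);          # signal that no more batches are coming
my $processed = $worker->join(); # wait for the worker to drain the queue

print "processed $processed rows\n";
```

Because the queue keeps its own copy of each batch, the main loop can start filling a new @set right away while the worker is still processing earlier ones.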