Searching an array in Perl using a regular expression

Asked: 2015-01-30 20:16:39

Tags: arrays regex perl

my @array = ('Joe','Jim','Jim_BOB','Hello');
$search = "Joe";
$search2 = "Hello";
$search3 = "Jim";
$search4 =~ qw/.*?_.*?/;

my %index;
@index{@array} = (0..$#array);
my $index = $index{$search};
my $index2 = $index{$search2};
my $index3 = $index{$search3};
my $index4 = $index{$search4};
print $index,",",$index2,",",$index3,",",$index4, "\n";

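This returns 0,3,1, which are the indexes of the $search items in @array. The lookup does not work for $search4, though, because it is a regular expression. My question is: how can I search @array with a regular expression?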

3 answers:

Answer 0 (score: 2)

qw is for quoting a list of words. To store a regular expression in a variable, use qr instead:

my $search4 = qr/_/; # the leading and trailing '.*?' are redundant 

To get a single matching index (the first one, since grep preserves order):

my ($index4) = grep $array[$_] =~ /$search4/, 0..$#array; 

Or all of them:

my @i = grep $array[$_] =~ /$search4/, 0..$#array;
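
Putting the pieces above together, here is a minimal self-contained sketch. It uses the array from the question plus one extra element, 'Mary_Ann', which is an assumption added only so that more than one element matches; the names @all and $first are likewise just illustrative.

```perl
use strict;
use warnings;

# The question's array, plus 'Mary_Ann' (added here for illustration only)
my @array   = ('Joe', 'Jim', 'Jim_BOB', 'Hello', 'Mary_Ann');
my $search4 = qr/_/;    # compiled regex object

# grep over the index range, keeping indexes whose element matches
my @all = grep { $array[$_] =~ $search4 } 0 .. $#array;

# list assignment to a single scalar keeps only the first index
my ($first) = @all;

print "first: $first, all: @all\n";    # first: 2, all: 2 4
```

Note the parentheses around $first: without them grep would be evaluated in scalar context and return the number of matches rather than an index.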

Answer 1 (score: 1)

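If your array contains duplicate elements, your current hash-based approach returns only the index of the last match. The other answers show how to fix your existing code, but to allow for duplicate elements you can use List::MoreUtils.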
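
To see why only the last index survives, consider this small sketch. The repeated 'Jim' at the end of the array is an assumption added for illustration; it is not in the question's data.

```perl
use strict;
use warnings;

# The question's array with 'Jim' repeated at index 4 (illustrative assumption)
my @array = ('Joe', 'Jim', 'Jim_BOB', 'Hello', 'Jim');

my %index;
@index{@array} = (0 .. $#array);    # a later duplicate key overwrites the earlier value

print "Jim => $index{Jim}\n";                   # Jim => 4
print "keys stored: ", scalar(keys %index), "\n";   # keys stored: 4
```

A hash can hold only one value per key, so index 1 is silently replaced by index 4.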

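The following shows how to get the first and last matching index for both a fixed search string and a regular expression, as well as how to get the indexes of all matches: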

use strict;
use warnings;
use 5.010;

use List::MoreUtils qw(first_index last_index indexes);

my @words = qw(Joe Jim Jim_BOB Hello Jim Hello Jim);

my $string = 'Jim';
my $regex = '^J';

say "First $string: " . first_index { $_ eq $string } @words;
say "Last $string: " . last_index { $_ eq $string } @words;
say "All $string: " . join ', ', indexes { $_ eq $string } @words;

say "First regex: " . first_index { /$regex/ } @words;
say "Last regex: " . last_index { /$regex/ } @words;
say "All regex: " . join ', ', indexes { /$regex/ } @words;

Output:

First Jim: 1
Last Jim: 6
All Jim: 1, 4, 6
First regex: 0
Last regex: 6
All regex: 0, 1, 2, 4, 6

Answer 2 (score: 0)

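There is a separate problem in your code: $search4 is not a regex. $search4 =~ qw/.*?_.*?/; means that you are matching the undefined variable $search4 against qw/.*?_.*?/. qw basically splits a string on whitespace. Here there is no whitespace, so you are matching against the single string .*?_.*?. In void context this has no effect at all, and $search4 stays undefined.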
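
A quick way to check the difference between the two operators is to inspect what each one returns. This is only a sketch; the variable names @words and $regex are illustrative.

```perl
use strict;
use warnings;

my @words = qw/.*?_.*?/;    # qw yields a list of plain strings (one here, no whitespace)
my $regex = qr/.*?_.*?/;    # qr yields a compiled regex object

print scalar(@words), " word: $words[0]\n";             # 1 word: .*?_.*?
print "ref of qw element: '", ref($words[0]), "'\n";    # '' (a plain string)
print "ref of qr value: '", ref($regex), "'\n";         # 'Regexp'
```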

With use strict; use warnings; in place, and once the variables are declared, you get the corresponding warnings.

$ cat t.pl
use strict;
use warnings;

my @array = ('Joe','Jim','Jim_BOB','Hello');
my $search = "Joe";
my $search2 = "Hello";
my $search3 = "Jim";
my $search4 =~ qw/.*?_.*?/;

my %index;
@index{@array} = (0..$#array);
my $index = $index{$search};
my $index2 = $index{$search2};
my $index3 = $index{$search3};
my $index4 = $index{$search4};
print $index,",",$index2,",",$index3,",",$index4, "\n";

$ perl t.pl 
Use of uninitialized value $search4 in pattern match (m//) at t.pl line 8.
Use of uninitialized value $search4 in hash element at t.pl line 15.
Use of uninitialized value $index4 in print at t.pl line 16.
0,3,1,

I think you meant $search4 = qr/.*?_.*?/

One solution to the problem is to treat the regexp as a special case and loop over the array.

$ cat t2.pl 
use strict;
use warnings;

my @array = ('Joe','Jim','Jim_BOB','Hello');
my $search = 'Joe';
my $search2 = 'Hello';
my $search3 = 'Jim';
my $search4 = qr/.*?_.*?/;

my %index;
@index{@array} = (0..$#array);
my $index = $index{$search};
my $index2 = $index{$search2};
my $index3 = $index{$search3};

# loop over the array until a match is found
my $cnt = 0;
my $index4;
for my $elem ( @array ) {
    if ( $elem =~ $search4 ) {
        $index4 = $cnt;
        last;
    }
    $cnt++;
}

print "$index,$index2,$index3,$index4\n";

$ perl t2.pl 
0,3,1,2
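
The explicit loop can also be written more compactly with first from the core List::Util module, applied to the index range rather than to the elements. This is a sketch of an alternative, not part of the original answer; it uses the simplified pattern qr/_/.

```perl
use strict;
use warnings;
use List::Util qw(first);

my @array   = ('Joe', 'Jim', 'Jim_BOB', 'Hello');
my $search4 = qr/_/;

# first returns the first index for which the block is true, or undef if none is
my $index4 = first { $array[$_] =~ $search4 } 0 .. $#array;

# test with defined, because a match at index 0 is a false value
print defined $index4 ? "$index4\n" : "no match\n";    # 2
```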

If you want to keep using a lookup hash, you may want the CPAN module Tie::Hash::Regex.

$ cat t3.pl 
use strict;
use warnings;

# modules from CPAN
use Tie::Hash::Regex;

my @array = ('Joe','Jim','Jim_BOB','Hello');
my $search = "Joe";
my $search2 = "Hello";
my $search3 = "Jim";
my $search4 = qr/.*?_.*?/;

my %index;
tie %index, 'Tie::Hash::Regex';
@index{@array} = (0..$#array);
my $index = $index{$search};
my $index2 = $index{$search2};
my $index3 = $index{$search3};
my $index4 = $index{$search4};
print "$index,$index2,$index3,$index4\n";

$ perl t3.pl
0,3,1,2

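Note that this solution has some drawbacks. If several keys match, there is no guarantee which of the matching keys you get. And if you pass a string that does not look like a regex, it is still treated as a regex.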
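
If those drawbacks matter, one way around them without a tied hash is a small helper that decides by the type of its argument. The sketch below is only an illustration: find_indexes is a hypothetical name, not something from a module, and it returns every matching index in array order.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every index of @$list that matches $needle.
# A qr// object is matched as a regex; anything else is compared with eq.
sub find_indexes {
    my ($list, $needle) = @_;
    my $is_regex = ref($needle) eq 'Regexp';
    return grep {
        my $item = $list->[$_];
        $is_regex ? $item =~ $needle : $item eq $needle;
    } 0 .. $#{$list};
}

my @array = ('Joe', 'Jim', 'Jim_BOB', 'Hello');

my @exact   = find_indexes(\@array, 'Jim');     # plain string: exact match only
my @pattern = find_indexes(\@array, qr/^J/);    # regex: elements starting with J
my @literal = find_indexes(\@array, '.*');      # a string, so not used as a regex

print "exact: @exact\n";                               # exact: 1
print "pattern: @pattern\n";                           # pattern: 0 1 2
print "literal: ", scalar(@literal), " match(es)\n";   # literal: 0 match(es)
```

Because the result is a list in array order, the caller chooses which match to use, and a plain string such as '.*' is never interpreted as a pattern.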