Take a list of phone numbers, search another array for each occurrence of a number, and print the matching line and the line that follows it

Time: 2015-04-27 19:57:22

Tags: regex perl

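I have two text files. I read each file into an array. Each value in the numbers array should be looked up in the users array. If a match is found, print the matching line and the line that follows it.

So if the first entry in the numbers array is 1234, search the users array for 1234. If it is found, print that line and the next line.

numbers.txt looks like this: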

1234567021
1234566792

The users file looks like this:

1234567021@host.com User-Password == "secret"
           Framed-IP-Address = 000.000.000.000,

What I have so far:

use strict;

my $users_file = "users";
my $numbers_file = "numbers.txt";
my $phonenumber;
my $numbers;


#### Place phone number into an array ####
open (RESULTS, $numbers_file) or die "Unable to open file: $users_file\n$!";
my @numbers;
@numbers = <NUMBER_RESULTS>;
close(NUMBER_RESULTS);

#### Place users file contents into an array ####
open (RESULTS, $users_file) or die "Unable to open file: $users_file\n$!";
my @users_data;
@users_data = <RESULTS>;
close(RESULTS);


#### Search the array for the string ####
foreach $numbers(@users_data) {
    if (index($numbers,$phonenumber) ge 0) {
  my @list = grep /\b$numbers\b/, @users_data;
  chomp @list;
  print "$_\n" foreach @list;
    }
}


exit 1;

1 answer:

Answer 0: (score: 0)

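You are re-implementing a key lookup when Perl has a built-in hash data type that handles this better and faster than rolling your own. Using a hash takes a little more work while reading the data, but it is worth it: each number is then found with a single lookup instead of a scan over the whole users array.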
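To make the idea concrete, here is a minimal sketch of a hash lookup, assuming the users file has already been parsed into a hash keyed by phone number. The helper name lookup_number and the sample data are made up for illustration and are not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the parsed users file:
# key = phone number, value = the two lines to print for that number.
my %users_data = (
    '1234567021' => '1234567021@host.com User-Password == "secret"' . "\n"
                  . '           Framed-IP-Address = 000.000.000.000,' . "\n",
);

# Return the stored lines for a number, or undef if the number is unknown.
# (lookup_number is an illustrative helper, not from the original post.)
sub lookup_number {
    my ($table, $number) = @_;
    return exists $table->{$number} ? $table->{$number} : undef;
}

for my $number ('1234567021', '1234566792') {
    my $entry = lookup_number(\%users_data, $number);
    print defined $entry ? $entry : "no entry for $number\n";
}
```

The lookup cost does not depend on how many users are stored, whereas a grep over the users array is repeated in full for every number.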

First, let's switch to the modern form of open, where the filehandle is a lexically scoped variable and the mode is given explicitly.

open (my $results, "<", $users_file) or die "Unable to open file: $users_file\n$!";
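As a rough, self-contained illustration of the three-argument open with a lexical filehandle, the sketch below reads a file into a list of chomped lines. The helper read_lines is hypothetical, and the temporary file created with File::Temp only stands in for numbers.txt so that the example runs on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Read every line of a file into a list, with trailing newlines removed.
# (read_lines is an illustrative helper, not from the original post.)
sub read_lines {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Unable to open file: $path\n$!";
    chomp(my @lines = <$fh>);
    close($fh);
    return @lines;
}

# Create a throwaway file so the sketch does not depend on numbers.txt.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "1234567021\n1234566792\n";
close($tmp_fh);

my @numbers = read_lines($tmp_path);
print "$_\n" for @numbers;
```

Because the handle lives in a lexical variable, a misspelled handle name becomes a compile-time error under use strict rather than a silent read from an unopened bareword handle.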

From there, we read the opened file one line at a time and populate the hash.

my (%users_data, $number, $number_line);
while(<$results>)
{
    chomp;
    if(defined($number))
    {
        $user_data{$number} = "$number_line\t$_\n";    #load the line after the number into the hash value.
        undef $number;
    }else
    {
        if(/^(\d+)\@/)     #match digits between the beginning of the line and the @ symbol.
        {
            $number = $1;    #save matched digits from $1.
            $number_line = $_;
        }
    }
}

Note that this assumes the data is well formed. If that is a concern, you can test for the correct format in the else clause.
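One possible way to add such a format check is sketched below. It assumes that a continuation line always begins with whitespace, which matches the sample users file but is not stated in the question. The function parse_users and its problem messages are invented for illustration.

```perl
use strict;
use warnings;

# Parse user records from a list of lines. Returns a reference to the hash of
# records and a reference to a list of complaints about unexpected lines.
# (parse_users is an illustrative helper, not from the original post.)
sub parse_users {
    my @lines = @_;
    my (%users_data, @problems, $number, $number_line);
    for my $line (@lines) {
        chomp $line;
        if (defined $number) {
            # Assumption: the line after a number line starts with whitespace.
            if ($line =~ /^\s+\S/) {
                $users_data{$number} = "$number_line\n$line\n";
                undef $number;
                next;
            }
            push @problems, "record for $number has no continuation line";
            undef $number;    # fall through and re-examine the current line
        }
        if ($line =~ /^(\d+)\@/) {
            $number      = $1;
            $number_line = $line;
        } elsif ($line =~ /\S/) {
            push @problems, "unrecognized line: $line";
        }
    }
    push @problems, "record for $number has no continuation line" if defined $number;
    return (\%users_data, \@problems);
}

my ($users, $problems) = parse_users(
    '1234567021@host.com User-Password == "secret"',
    '           Framed-IP-Address = 000.000.000.000,',
    '1234566792@host.com User-Password == "secret"',
    'garbage',
);
print $users->{'1234567021'};
print "problem: $_\n" for @$problems;
```

Collecting the complaints instead of dying lets the script still print every well-formed record while reporting the lines it could not interpret.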

Now, for the output, we can use the following:

for (@numbers)
{
    chomp;    #since we didn't remove newlines when populating @numbers
    if( defined($users_data{$_}) )
    {
        print $users_data{$_};
    }
}

Edit

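Here is a working version. Note that use strict and use warnings help catch cases where one name is declared (RESULTS, %users_data) and a different one is used later (NUMBER_RESULTS, %user_data), which is why they are so important. In addition, Data::Dumper is used to print the contents of the array @numbers and the hash %users_data, so you can see which data actually ended up in the data structures.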
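For readers who have not used Data::Dumper, the short sketch below shows the kind of inspection meant here. The sample values are made up. Setting $Data::Dumper::Useqq and $Data::Dumper::Sortkeys is optional and is only done so that trailing newlines show up as \n and the key order is stable.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Useqq    = 1;    # show newlines as \n so unchomped data is visible
$Data::Dumper::Sortkeys = 1;    # stable key order makes dumps easier to compare

# Hypothetical contents, shaped like the data the script reads.
my @numbers    = ("1234567021\n", "1234566792\n");    # as read, before chomp
my %users_data = ('1234567021' => "first line\nsecond line\n");

print Dumper(\@numbers);
print Dumper(\%users_data);
```

Dumping @numbers this way makes it obvious that each entry still carries its newline, which is why the output loop has to chomp before using the value as a hash key.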

#!/usr/bin/env perl
use strict;
use warnings;
#use Data::Dumper;

my $users_file = "users";
my $numbers_file = "numbers.txt";

#### Place phone number into an array ####
open (my $results, "<", $numbers_file) or die "Unable to open file: $numbers_file\n$!";
my @numbers;
@numbers = <$results>;
close($results);
#print Dumper \@numbers;

#### Place users file contents into a hash keyed by phone number ####
# A separate handle name avoids redeclaring $results in the same scope,
# which would trigger a "masks earlier declaration" warning.
open (my $users_fh, "<", $users_file) or die "Unable to open file: $users_file\n$!";
my (%users_data, $number, $number_line);
while(<$users_fh>)
{
    chomp;
    if(defined($number))
    {
        $users_data{$number} = $number_line."\n$_\n";    #load the line after the number into the hash value.
        undef $number;
    }else
    {
        if(/^(\d+)\@/)     #match digits between the beginning of the line and the @ symbol.
        {
            $number = $1;    #save matched digits from $1.
            $number_line = $_;
        }
    }
}
close($users_fh);
#print Dumper \%users_data;

for (@numbers)
{
    chomp;    #since we didn't remove newlines when populating @numbers
    if( defined($users_data{$_}) )
    {
        print $users_data{$_};
    }
}