Perl regex: stuck with regex capture

Time: 2013-06-04 10:19:36

Tags: regex perl loops for-loop

I am trying to extract the time taken by processes from a log.

For example, the log contains (relevant lines):

Time for search copy=15 s.
Time for content copy=45 s.
Time for unzip reply=20 s.

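The lines above sit among many other lines in the log that are not needed. Several jobs generate logs like this (the log is named process.out), so we use Job_name as the identifier for each job. I use hashes to read the log of a specific job. Here is the code: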

#!/usr/bin/perl

use strict;
use warnings;
use File::Basename;

my %Log_name = ();
my %File_name = ();
my %filetoread = ();
my %filetoreadStrings = ();
my @ftr      = ();
my @reply    = ();
my @content  = ();
my @search   = ();
my %Reply    = ();
my %Search   = ();
my %Content = ();
my $curr_dir=`pwd`;
chop($curr_dir);

my $Log_name = "ABC-DEF";
my $File_name=<$curr_dir/p*.out>;
my $filetoread = basename ($File_name);
my $filetoreadStrings=`strings $filetoread  | egrep "(Time for)"`;
@ftr = split('\n', $filetoreadStrings);
chomp (@ftr);


for (my $count = 0; $count < 6; $count++)    #The lines are repeated 6 times except for the "search copy" line which is repeated twice
{
 $reply[$count] = (grep /Time for unzip reply/, @ftr)[$count];
 $content[$count] = (grep /Time for content copy/, @ftr)[$count];
 $search[$count] = (grep /Time for search copy/, @ftr)[$count];
 if (defined $reply[$count]) 
 {
 ($Reply{$Log_name})  = $reply[$count] =~ /Time for unzip reply=(\d+) s./;

 printf "$Reply{$Log_name}\n";
 }
  if (defined $content[$count]) {
 ($Content{$Log_name})=$content[$count]=~/Time for content copy=(\d+) s./;

 printf "$Content{$Log_name}\n";
  }
  if (defined $search[$count]) {
   ($Search{$Log_name})  = $search[$count] =~ /Time for search copy=(\d+) s./;

   printf "$Search{$Log_name}\n";
  }

 }

The output of the code above is:

Use of uninitialized value in concatenation (.) or string at new_try_loop.pl line 46.

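This output appears for each printf statement. I actually need to add up these time values to compute the total time; I have not shown that part of the code, because the important thing is to get the "time" first.

What needs to be done here? Please let me know if any other information is needed.

Initially I did not use a for loop, and the code ran fine. For example,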

$reply1 = (grep /Time for unzip reply/, @ftr)[0];
($Reply1{$Log_name})  = $reply1 =~ /Time for unzip reply=(\d+) s./;
$reply2 = (grep /Time for unzip reply/, @ftr)[1];
($Reply2{$Log_name})  = $reply2 =~ /Time for unzip reply=(\d+) s./;
$reply3 = (grep /Time for unzip reply/, @ftr)[2];
($Reply3{$Log_name})  = $reply3 =~ /Time for unzip reply=(\d+) s./;
.......... and so on

Values were stored in $Content{$Log_name} and $Search{$Log_name} in the same way. I captured the regex matches in these variables and then added them up. I am now using a for loop to tidy this up.

1 Answer:

Answer 0: (score: 0)

A section like this

if (defined $reply[$count]) 
 {
 ($Reply{$Log_name})  = $reply[$count] =~ /Time for unzip reply=(\d+) s./;

 printf "$Reply{$Log_name}\n";
 }

becomes

if (defined $reply[$count] && ($reply[$count] =~ /Time for unzip reply=(\d+) s./) ) 
 {
 ($Reply{$Log_name})  = $1;
 print "$1\n";
 }
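
As a minimal, self-contained sketch of the same guard applied to all three kinds of line (this is not the asker's script): the sample lines in @ftr are invented for illustration, and the %total hash and the single combined pattern are assumptions introduced here. It uses the captures only when the full pattern matched, and sums the seconds per category in one pass.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data standing in for the "strings | egrep" output.
my @ftr = (
    'Time for search copy=15 s.',
    'Time for content copy=45 s.',
    'Time for unzip reply=20 s.',
    'Time for unzip reply=n/a',      # matches a loose pattern only
    'Time for content copy=5 s.',
);

# Sum of seconds per category; the hash name is an assumption.
my %total;

for my $line (@ftr) {
    # Use $1 and $2 only when the full pattern matched.
    if ($line =~ /Time for (unzip reply|content copy|search copy)=(\d+) s\./) {
        $total{$1} += $2;
    }
}

# Report one total per category found in the data.
for my $kind (sort keys %total) {
    print "$kind: $total{$kind} s\n";
}
```

Lines that fail the full pattern are simply skipped, so no undefined value ever reaches print.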

I assume that some of your data matches /Time for unzip reply/ but not /Time for unzip reply=(\d+) s./
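
To illustrate that assumption with a hypothetical line (the sample string is invented, not taken from the log): a failed match in list context returns an empty list, so the list assignment leaves the variable undefined, and interpolating it is what produces the "uninitialized value" warning.

```perl
use strict;
use warnings;

# Hypothetical line: passes the loose grep, fails the strict capture.
my $line = 'Time for unzip reply=n/a';

# The loose pattern used in grep matches this line.
my $loose = $line =~ /Time for unzip reply/ ? 1 : 0;

# A failed match in list context yields an empty list,
# so $value is left undefined.
my ($value) = $line =~ /Time for unzip reply=(\d+) s\./;

print "loose match: $loose\n";
print "captured: ", (defined $value ? $value : 'undef'), "\n";
```

Checking the result of the match itself, rather than only whether the grep result is defined, avoids this case.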