Perl: search a list for a string, then search for another string below/after it

Time: 2014-04-16 01:33:49

Tags: regex perl list

I have a list like the following:

Policy Name:       PTCC-VNX7500-server_4A
Options:           0x0
template:          FALSE
Schedule:              MonthlyFull
  Type:                FULL (0)
  Calendar sched: Enabled
    Allowed to retry after run day
    Last day of month
  Maximum MPX:         1
  Synthetic:           0
  Retention Level:     11 (3 years)

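I need to extract "Schedule:" (i.e. Schedule: MonthlyFull)

...and then "Retention Level:" (i.e. Retention Level: 11 (3 years)). That string ("Retention Level:") can appear anywhere below the word "Schedule:".

I would like to end up with something like this: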

PTCC-VNX7500-server_4A,MonthlyFull,11 (3 years)
PTCC-VNX7500-server_4A,WeeklyFull,8 (4 weeks)
PTCC-VNX7500-server_4A,7_Year,1 (7 years)

I tried to find a solution here and on PerlMonks, without success.

Thanks!

3 Answers:

Answer 0: (score: 0)

Here is one way:

use strict;
use warnings;

my @rec;

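# Collect the three fields of interest; print a row once the last one is seen.
while (my $line = <DATA>) {
    if ($line =~ /Policy Name:|Schedule:|Retention Level:/) {
        chomp($line);
        # A limit of 2 keeps any colon inside the value intact.
        my ($name, $value) = split /:\s*/, $line, 2;
        push @rec, $value;
        if ($line =~ /Retention Level/) {
            local $" = ",";   # join interpolated array elements with commas
            print "@rec\n";
            @rec = ();
        }
    }
}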
__DATA__
Policy Name:       PTCC-VNX7500-server_4A
Options:           0x0
template:          FALSE
Schedule:              MonthlyFull
  Type:                FULL (0)
  Calendar sched: Enabled
    Allowed to retry after run day
    Last day of month
  Maximum MPX:         1
  Synthetic:           0
  Retention Level:     11 (3 years)
Policy Name:       PTCC-VNX7500-server_4A
Options:           0x0
template:          FALSE
Schedule:              WeeklyFull
  Type:                FULL (0)
  Calendar sched: Enabled
    Allowed to retry after run day
    Last day of month
  Maximum MPX:         1
  Synthetic:           0
  Retention Level:     8 (4 weeks)

Output:

PTCC-VNX7500-server_4A,MonthlyFull,11 (3 years)
PTCC-VNX7500-server_4A,WeeklyFull,8 (4 weeks)
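
The loop above clears @rec after every row, so if one policy contains several schedules (as the desired output in the question suggests), later rows would be printed without the policy name. A minimal sketch of a variant that remembers the policy name across schedules follows; format_schedules is a hypothetical helper name, and the sample text assumes the layout shown in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: takes a list of lines, returns "policy,schedule,retention" rows.
sub format_schedules {
    my @lines = @_;
    my $policy = '';          # last policy name seen; reused for every schedule under it
    my ($schedule, @rows);
    for my $line (@lines) {
        if ($line =~ /^\s*Policy Name:\s*(.*?)\s*$/) {
            $policy = $1;
        }
        elsif ($line =~ /^\s*Schedule:\s*(.*?)\s*$/) {
            $schedule = $1;
        }
        elsif (defined $schedule && $line =~ /^\s*Retention Level:\s*(.*?)\s*$/) {
            push @rows, join(',', $policy, $schedule, $1);
            undef $schedule;  # wait for the next Schedule: line
        }
    }
    return @rows;
}

# Assumed sample: one policy with two schedules.
my $text = <<'END';
Policy Name:       PTCC-VNX7500-server_4A
Schedule:              MonthlyFull
  Type:                FULL (0)
  Retention Level:     11 (3 years)
Schedule:              WeeklyFull
  Type:                FULL (0)
  Retention Level:     8 (4 weeks)
END

print "$_\n" for format_schedules(split /\n/, $text);
```

Matching each key at the start of the line (after optional indentation) also avoids picking up unrelated lines that merely contain one of the keywords.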

Answer 1: (score: 0)

use strict;
use warnings;
use autodie;

my %record;   # key/value pairs of the policy currently being read

while(<DATA>) {
    if (/^\s*(.*?):\s*(.*)/) {
        my ($k, $v) = ($1, $2);
        if ($k eq 'Policy Name' && %record) {
            print join(',', @record{('Policy Name', 'Schedule', 'Retention Level')}), "\n";
            %record = ();
        }
        $record{$k} = $v;
    }
}

print join(',', @record{('Policy Name', 'Schedule', 'Retention Level')}), "\n";

__DATA__
Policy Name:       PTCC-VNX7500-server_4A
Options:           0x0
template:          FALSE
Schedule:              MonthlyFull
  Type:                FULL (0)
  Calendar sched: Enabled
    Allowed to retry after run day
    Last day of month
  Maximum MPX:         1
  Synthetic:           0
  Retention Level:     11 (3 years)
Policy Name:       PTCC-VNX7500-server_123
Options:           0x0
template:          FALSE
Schedule:              SometimesEmpty
  Type:                FULL (0)
  Calendar sched: Enabled
    Allowed to retry after run day
    Last day of month
  Maximum MPX:         1
  Synthetic:           0
  Retention Level:     41 (8 years)
Policy Name:       PTCC-VNX7500-server_789
Options:           0x0
template:          FALSE
Schedule:              AlwaysBusy
  Type:                FULL (0)
  Calendar sched: Enabled
    Allowed to retry after run day
    Last day of month
  Maximum MPX:         1
  Synthetic:           0
  Retention Level:     17 (2 years)

Output:

PTCC-VNX7500-server_4A,MonthlyFull,11 (3 years)
PTCC-VNX7500-server_123,SometimesEmpty,41 (8 years)
PTCC-VNX7500-server_789,AlwaysBusy,17 (2 years)

Answer 2: (score: 0)

You didn't specify this, but assuming each schedule sits under its own policy name, you can use this regex:

Policy Name:\s*([^\n]+).*?Schedule:\s*([^\n]+).*?Retention Level:\s*([^\n]+)

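This looks for Policy Name: followed by zero or more whitespace characters, then captures everything up to the newline. Next, it lazily matches any number of characters until it reaches Schedule: followed by zero or more whitespace characters, and again captures everything up to the newline. Finally, it (surprise?) lazily matches any number of characters until Retention Level: followed by zero or more whitespace characters, and captures up to the newline.

This gives you 3 capture groups containing the policy name, the schedule, and the retention level. You need the global modifier (g) to match more than one policy in a single pass, the dot-matches-newline modifier (s) so that .*? can cross line breaks, and optionally the case-insensitive modifier (i).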
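
As a rough illustration of how the regex and its modifiers fit together in Perl, here is a minimal sketch. The name extract_policies and the sample text are assumptions made for the example; the pattern itself is the one given above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns one "policy,schedule,retention" row per match.
sub extract_policies {
    my ($text) = @_;
    my @rows;
    # /g walks through every match in turn; /s lets .*? cross newlines.
    while ($text =~ /Policy Name:\s*([^\n]+).*?Schedule:\s*([^\n]+).*?Retention Level:\s*([^\n]+)/gs) {
        push @rows, join(',', $1, $2, $3);
    }
    return @rows;
}

# Assumed sample: two policies, each with exactly one schedule.
my $text = <<'END';
Policy Name:       PTCC-VNX7500-server_4A
Options:           0x0
Schedule:              MonthlyFull
  Type:                FULL (0)
  Retention Level:     11 (3 years)
Policy Name:       PTCC-VNX7500-server_123
Options:           0x0
Schedule:              WeeklyFull
  Type:                FULL (0)
  Retention Level:     8 (4 weeks)
END

print "$_\n" for extract_policies($text);
```

This reads the whole input into one string first, which is fine for small listings; for very large files the line-by-line approaches in the other answers may be the better fit.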


If there can be several schedules under one policy name, you can use this regex:

(?:Policy Name:\s*([^\n]+).*?)?Schedule:\s*([^\n]+).*?Retention Level:\s*([^\n]+)

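This is very similar: we just wrap the whole Policy Name:\s*([^\n]+).*? part in a non-capturing group and make it optional, which means it does not have to match. So the first match will fill all 3 capture groups (1: policy, 2: schedule, 3: retention), while later matches may fill only 2 of them (1: null, 2: schedule, 3: retention). You then use your language of choice to work out the policy name for such a match (by taking it from the previous match).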
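
In Perl, carrying the policy name forward from the previous match could look roughly like the sketch below. The name extract_schedules and the sample text are assumptions made for the example; when the optional group does not take part in a match, $1 is undefined, and that is what the code checks.

```perl
use strict;
use warnings;

# Hypothetical helper: returns one "policy,schedule,retention" row per schedule.
sub extract_schedules {
    my ($text) = @_;
    my ($policy, @rows);
    while ($text =~ /(?:Policy Name:\s*([^\n]+).*?)?Schedule:\s*([^\n]+).*?Retention Level:\s*([^\n]+)/gs) {
        my ($name, $schedule, $retention) = ($1, $2, $3);
        # Group 1 is undef when this match started at a Schedule: line,
        # so keep the policy name remembered from an earlier match.
        $policy = $name if defined $name;
        push @rows, join(',', defined $policy ? $policy : '', $schedule, $retention);
    }
    return @rows;
}

# Assumed sample: one policy with two schedules.
my $text = <<'END';
Policy Name:       PTCC-VNX7500-server_4A
Schedule:              MonthlyFull
  Retention Level:     11 (3 years)
Schedule:              WeeklyFull
  Retention Level:     8 (4 weeks)
END

print "$_\n" for extract_schedules($text);
```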