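Perl if condition doesn't seem to work

Date: 2015-09-07 17:10:29

Tags: perl if-statement

I've been stuck on this for a week now. I can't get the if condition to work in Perl, and I don't know where I'm going wrong.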

#!/usr/bin/perl

use strict;
use warnings;

local $/ = undef;

my $data = <DATA>;

$data =~ s/(\s+)/ /gi;
$data =~ s/\[|\]//gi;

while ($data =~ /:(?<end>\d+) (?<notneeded>- [a-z0-9\-']+) (?<needed>[a-z0-9\-']+?) (?<start>\d+)(?=:)/ig) {
    if (($+{start} - $+{end}) == 0) {
        $data =~ s/:(?<end>\d+) (?<notneeded>- [a-z0-9\-']+) (?<needed>[a-z0-9\-']+?) (?<start>\d+)(?=:)/:$+{end} $+{needed} $+{start}/i;
        print "\nFull match: '" . "$&" . "'\n";
        print "\nStart: '" . "$+{start}" . "'\n";
        print "\nEnd: '" . "$+{end}" . "'\n";
        print "\nDiff: '" . ($+{start} - $+{end}) . "'\n";
    }

}

#print "$data\n\n";

__DATA__

lorem          [1170:1540]
ipsum          [1540:2040]
dolor          [2040:2350]
sit            [2350:2510]
amet           [2510:2670]
consectetur    [2670:3130]
adipiscing     [3130:3240]
elit           [3240:3470]
quisque        [3550:4070]
egestas        [4070:4290]
magna          [4290:4570]
sit            [4620:4650]
amet           [4780:5390]
molestie       [5480:6660]
imperdiet      [6660:6890]
- velit
lectus         [6920:6950]
egestas        [7130:7530]
enim           [7570:7830]
non            [7830:8160]
ornare         [8160:8260]
eros           [8260:8600]
- neque
non            [8600:8890]
risus          [9120:9450]
aenean         [9450:9570]
venenatis      [9570:9820]
- hendrerit
- urna
- nec
- bibendum
nunc           [11210:11380]
lobortis       [11380:11470]
in             [11470:11710]
in             [11780:11810]
facilisis      [12960:13340]
urna           [13340:13460]
in             [13460:13920]
neque          [14070:14630]
bibendum       [14630:14930]
lobortis       [14930:15250]
- maecenas
- efficitur
- fermentum
eros           [17060:17450]
malesuada      [17450:17760]
posuere        [17760:17810]
nisi           [18050:18080]
- tristique
- sit

If I change the if condition above to any of the following variants, it still doesn't seem to work.

if ($+{start} == $+{end})
if (($+{start} - $+{end}) eq 0)
if ("$+{start}" eq "$+{end})")

The desired output is just the second match, i.e. ":8600 - neque non 8600"

3 answers:

Answer 0 (score: 4)

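The problem is that a substitution without the global /g modifier searches from the beginning of the string every time

  • The first time through the loop it finds the values 6890 ... 6920 and skips them because they are not equal

  • The second time it finds 8600 ... 8600, so the substitution is performed. But s/// finds the first occurrence of the pattern, and changes :6890 - velit lectus 6920 to :6890 lectus 6920

  • The third time, the global search starts again from the beginning because the string has been modified, so it now finds 8600 ... 8600 again and the substitution is performed, this time replacing :8600 - neque non 8600 with :8600 non 8600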
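The sequence above can be reproduced on a much smaller string. The following is only a sketch: naive_fix is a hypothetical helper, the pattern is simplified from the one in the question, and the input string is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: runs a loop shaped like the one in the question
# and returns the rewritten string. Pattern simplified for illustration.
sub naive_fix {
    my ($text) = @_;
    my $re = qr/:(\d+) - \w+ (\w+) (\d+)(?=:)/;
    while ($text =~ /$re/g) {
        next unless $1 == $3;
        # No /g here: this match starts at the beginning of $text, so it
        # rewrites the first record that fits the pattern, not the record
        # the while loop just found. Modifying $text also resets pos().
        $text =~ s/$re/:$1 $2 $3/;
    }
    return $text;
}

# Only the 30 ... 30 record has equal numbers, yet both get rewritten.
print naive_fix(":10 - aa bb 20:30 - cc dd 30:40"), "\n";
```

The 10 ... 20 record loses its filler word even though its numbers differ, which mirrors what happens to the 6890 ... 6920 record in the question's data.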

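You have chosen a very awkward approach. If you explained what you are trying to achieve, I could help you better, but your substitutions make nonsense of the file contents; for example, they remove the colon separator from between two numbers, so there is no way to tell where to split the resulting string of digits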


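I would write it like this

There is no need to drop back into Perl to check whether the end and start values are the same; you can use a backreference to match the same number again. I also think your named captures confuse things rather than clarify them; for example, the notneeded capture doesn't need to be captured at all, let alone named

Nor is there any need for a Perl while loop, because a simple s///g will find and replace all occurrences by itself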

use strict;
use warnings;

my $data = do {
    local $/;
    <DATA>;
};

$data =~ s/\s+/ /g;
$data =~ tr/[]//d;

my $re = qr/ : (\d+) \s+ -\s[\p{alnum}'-]+ \s+ ([\p{alnum}'-]+) \s+ \1 /x;

my $n = $data =~ s/$re/$1 $2 $1/g;

printf "%d %s made\n", $n, $n == 1 ? 'substitution' : 'substitutions';

Output

1 substitution made
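For comparison, here is a variant sketched under two assumptions that are not part of this answer: that the leading colon should be kept (as in the question's replacement string), and that a number such as 8600 should not be allowed to match the front of a longer number. drop_filler is a hypothetical name and the sample input is shortened.

```perl
use strict;
use warnings;

# Hypothetical helper: removes the "- word" filler only when the number
# before it and the number after the following word are identical.
sub drop_filler {
    my ($text) = @_;
    # (\d+) captures the end value; \1 requires the same digits again.
    # (?!\d) is an added guard so 8600 cannot match the start of 86001.
    my $re = qr/ : (\d+) \s+ - \s [\p{Alnum}'-]+ \s+ ([\p{Alnum}'-]+) \s+ \1 (?!\d) /x;
    (my $out = $text) =~ s/$re/:$1 $2 $1/g;   # work on a copy
    return $out;
}

print drop_filler(":8260 - velit lectus 8290:8600 - neque non 8600:8890"), "\n";
```

Because the equality test lives inside the pattern, the 8260 ... 8290 record never matches and is left alone.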

Answer 1 (score: 2)

So I added some extra debugging code and found something interesting (lines 2 and 3, plus the print statement below them):

while ($data =~ /:(?<end>\d+) (?<notneeded>- [a-z0-9\-']+) (?<needed>[a-z0-9\-']+?) (?<start>\d+)(?=:)/ig) {
    my $start1 = $+{start};
    my $end1 = $+{end};
    print("start 1 is $start1\nend 1 is $end1\n\n");
    if ($start1 == $end1) {
        $data =~ s/:(?<end>\d+) (?<notneeded>- [a-z0-9\-']+) (?<needed>[a-z0-9\-']+?) (?<start>\d+)(?=:)/:$+{end} $+{needed} $+{start}/i;
        print "\nFull match: '" . "$&" . "'\n";
        print "\nStart: '" . "$+{start}" . "'\n";
        print "\nEnd: '" . "$+{end}" . "'\n";
        print "\nDiff: '" . ($+{start} - $+{end}) . "'\n";
    }
}

Here is the output:

[myuser@myhost tmp]$ ./tmp.pl
start 1 is 6920
end 1 is 6890

start 1 is 8600
end 1 is 8600


Full match: ':6890 - velit lectus 6920'

Start: '6920'

End: '6890'

Diff: '30'
start 1 is 8600
end 1 is 8600


Full match: ':8600 - neque non 8600'

Start: '8600'

End: '8600'

Diff: '0'

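As you can see, start1 and end1 are printed twice for 8600, and the first time the condition succeeds the "Full match" that follows is the 6890 ... 6920 record: the s/// inside the if performs its own match from the start of the string, so $& and %+ now describe that first occurrence rather than the one the while loop found. Does that make sense?

Your second regex is missing the 'g'...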
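The effect is easy to isolate: any later successful match replaces the contents of %+ (and $&). A minimal sketch, where captures_before_and_after is a hypothetical helper and the input is made up:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the value of the named capture as saved
# right after the outer match, and as seen after an inner match.
sub captures_before_and_after {
    my ($text) = @_;
    if ($text =~ /(?<num>\d+)\z/) {        # outer match: the last number
        my $saved = $+{num};               # copy it before matching again
        if ($text =~ /(?<num>\d+)/) {      # inner match: the first number
            return ($saved, $+{num});      # %+ now describes the inner match
        }
    }
    return;
}

print join(' -> ', captures_before_and_after("10 20 30")), "\n";
```

This is why copying the captures into lexical variables such as $start1 and $end1 before running another match gives the values the loop actually found.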

$data =~ s/:(?<end>\d+) (?<notneeded>- [a-z0-9\-']+) (?<needed>[a-z0-9\-']+?) (?<start>\d+)(?=:)/:$+{end} $+{needed} $+{start}/ig;

That seems to do the trick.

[myuser@myhost tmp]$ ./tmp.pl
start 1 is 6920
end 1 is 6890

start 1 is 8600
end 1 is 8600


Full match: ':8600 - neque non 8600'

Start: '8600'

End: '8600'

Diff: '0'
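One thing that may be worth checking with the /g version is what it does to records whose numbers differ. The sketch below uses a simplified pattern and a made-up input, and global_fix is a hypothetical helper; it assumes the goal is to change only the record with equal numbers.

```perl
use strict;
use warnings;

# Hypothetical helper: applies the substitution with /g and reports the
# resulting string together with the number of replacements made.
sub global_fix {
    my ($text) = @_;
    my $re = qr/:(\d+) - \w+ (\w+) (\d+)(?=:)/;
    # s///g returns the number of substitutions it performed.
    my $count = ($text =~ s/$re/:$1 $2 $3/g);
    return ($text, $count);
}

my ($out, $n) = global_fix(":10 - aa bb 20:30 - cc dd 30:40");
print "$out ($n replacements)\n";
```

In this sketch both records are rewritten, because the pattern itself does not test for equality; the printed "Full match" refers to the last record the substitution touched.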

Answer 2 (score: 0)

Here is my modified version of the code, which works correctly:

#!/usr/bin/perl

use strict;
use warnings;

local $/ = undef;

my $data = <DATA>;

$data =~ s/(\s+)/ /gi;
$data =~ s/\[|\]//gi;

while ($data =~ /:(?<end>\d+?) (?<notneeded>- [a-z0-9\-']+?) (?<needed>[a-z0-9\-']+?) (?<start>\d+?)(?=:)/ig) {
    if ("$+{start}" eq "$+{end}") {
        print "\nFull match: '" . "$&" . "'\n";
        print "\nStart: '" . "$+{start}" . "'\n";
        print "\nEnd: '" . "$+{end}" . "'\n";
        print "\nDiff: '" . ($+{start} - $+{end}) . "'\n";
        my $start = $+{start};
        my $end = $+{end};
        my $needed = $+{needed};

        $data =~ s/$&/:$end $needed $start/i;
    }

}

print "$data\n\n";

__DATA__

lorem          [1170:1540]
ipsum          [1540:2040]
dolor          [2040:2350]
sit            [2350:2510]
amet           [2510:2670]
consectetur    [2670:3130]
adipiscing     [3130:3240]
elit           [3240:3470]
quisque        [3550:4070]
egestas        [4070:4290]
magna          [4290:4570]
sit            [4620:4650]
amet           [4780:5390]
molestie       [5480:6660]
imperdiet      [6660:6890]
- velit
lectus         [6920:6950]
egestas        [7130:7530]
enim           [7570:7830]
non            [7830:8160]
ornare         [8160:8260]
eros           [8260:8600]
- neque
non            [8600:8890]
risus          [9120:9450]
aenean         [9450:9570]
venenatis      [9570:9820]
- hendrerit
- urna
- nec
- bibendum
nunc           [11210:11380]
lobortis       [11380:11470]
in             [11470:11710]
in             [11780:11810]
facilisis      [12960:13340]
urna           [13340:13460]
in             [13460:13920]
neque          [14070:14630]
bibendum       [14630:14930]
lobortis       [14930:15250]
- maecenas
- efficitur
- fermentum
eros           [17060:17450]
malesuada      [17450:17760]
posuere        [17760:17810]
nisi           [18050:18080]
- tristique
- sit
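The code above interpolates $& directly into a pattern, which works here because the matched text contains no regex metacharacters. A more defensive sketch, assuming the matched text could contain such characters, wraps it in \Q...\E; replace_literal is a hypothetical helper and the sample strings are made up.

```perl
use strict;
use warnings;

# Hypothetical helper: replaces the first literal occurrence of $found
# in $text with $replacement.
sub replace_literal {
    my ($text, $found, $replacement) = @_;
    # \Q...\E escapes regex metacharacters, so $found is matched as
    # plain text rather than being interpreted as a pattern.
    $text =~ s/\Q$found\E/$replacement/;
    return $text;
}

# Without \Q, "a+b" would mean "one or more a, then b" and not match here.
print replace_literal("a+b = c", "a+b", "sum"), "\n";
```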