I have been stuck on this for a week now. I can't get an if condition to work in Perl, and I don't know where I'm going wrong.
#!/usr/bin/perl
use strict;
use warnings;
local $/ = undef;
my $data = <DATA>;
$data =~ s/(\s+)/ /gi;
$data =~ s/\[|\]//gi;
while ($data =~ /:(?<end>\d+) (?<notneeded>- [a-z0-9\-']+) (?<needed>[a-z0-9\-']+?) (?<start>\d+)(?=:)/ig) {
if (($+{start} - $+{end}) == 0) {
$data =~ s/:(?<end>\d+) (?<notneeded>- [a-z0-9\-']+) (?<needed>[a-z0-9\-']+?) (?<start>\d+)(?=:)/:$+{end} $+{needed} $+{start}/i;
print "\nFull match: '" . "$&" . "'\n";
print "\nStart: '" . "$+{start}" . "'\n";
print "\nEnd: '" . "$+{end}" . "'\n";
print "\nDiff: '" . ($+{start} - $+{end}) . "'\n";
}
}
#print "$data\n\n";
__DATA__
lorem [1170:1540]
ipsum [1540:2040]
dolor [2040:2350]
sit [2350:2510]
amet [2510:2670]
consectetur [2670:3130]
adipiscing [3130:3240]
elit [3240:3470]
quisque [3550:4070]
egestas [4070:4290]
magna [4290:4570]
sit [4620:4650]
amet [4780:5390]
molestie [5480:6660]
imperdiet [6660:6890]
- velit
lectus [6920:6950]
egestas [7130:7530]
enim [7570:7830]
non [7830:8160]
ornare [8160:8260]
eros [8260:8600]
- neque
non [8600:8890]
risus [9120:9450]
aenean [9450:9570]
venenatis [9570:9820]
- hendrerit
- urna
- nec
- bibendum
nunc [11210:11380]
lobortis [11380:11470]
in [11470:11710]
in [11780:11810]
facilisis [12960:13340]
urna [13340:13460]
in [13460:13920]
neque [14070:14630]
bibendum [14630:14930]
lobortis [14930:15250]
- maecenas
- efficitur
- fermentum
eros [17060:17450]
malesuada [17450:17760]
posuere [17760:17810]
nisi [18050:18080]
- tristique
- sit
If I change the if condition above to any of the following variants, it still doesn't seem to work.
if ($+{start} == $+{end})
if (($+{start} - $+{end}) eq 0)
if ("$+{start}" eq "$+{end})")
The desired output is only the second match, i.e. ":8600 - neque non 8600"
Answer 0 (score: 4)
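The problem is that a substitution without the global /g modifier searches from the beginning of the string every time.
The first time through the loop, the match finds the values 6890 ... 6920 and they are skipped because they are not equal.
The second time it finds 8600 ... 8600, so the substitution is executed. But the substitution finds the first occurrence of the pattern, and changes :6890 - velit lectus 6920 to 6890 lectus 6920.
The third time, the global search starts again from the beginning because the string has been modified, so it now finds 8600 ... 8600 again and the substitution is executed, this time replacing :8600 - neque non 8600 with 8600 non 8600.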
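The sequence described above can be reproduced on a much smaller string. This is only an illustrative sketch: the sample text, the x=/y= markers and the variable names are made up for the demonstration and are not taken from the question.

```perl
use strict;
use warnings;

# Hypothetical sample: the intent is to rewrite only the pair whose value is 2.
my $text = 'x=1 x=2 x=3';
my @seen;    # values the while loop looked at, in order

while ($text =~ /x=(\d)/g) {
    push @seen, $1;
    next unless $1 == 2;

    # No /g here: this match starts at the beginning of $text, not at the
    # position where the loop's match was found, so it rewrites the first
    # remaining "x=" pair. Changing $text also resets pos($text), which
    # makes the loop's next /g match start from the beginning again.
    $text =~ s/x=(\d)/y=$1/;
}

print "seen: @seen\n";    # seen: 1 2 2 3
print "text: $text\n";    # text: y=1 y=2 x=3
```

The loop sees x=2 twice, and each time the substitution changes the leftmost remaining x= pair, so x=1 is rewritten even though it never satisfied the condition.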
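You have chosen a very awkward way of doing this. If you explained what you are trying to achieve I could help you better, but your substitutions make little sense for the contents of the file; for example, they remove the colon : separator from between two numbers, so there is no way to tell where to split the resulting string of digits.
There is no need to go back up to Perl to check whether the end and start values are the same; you can use a backreference to match the same number again. I also think your named captures confuse things rather than clarify them; for example, the notneeded capture doesn't even need to be a capture, let alone a named one.
There is also no need for a Perl while loop, because a simple s///g will find and replace all occurrences by itself.
I would write it like this: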
use strict;
use warnings;
my $data = do {
local $/;
<DATA>;
};
$data =~ s/\s+/ /g;
$data =~ tr/[]//d;
my $re = qr/ : (\d+) \s+ -\s[\p{alnum}'-]+ \s+ ([\p{alnum}'-]+) \s+ \1 /x;
my $n = $data =~ s/$re/$1 $2 $1/g;
printf "%d %s made\n", $n, $n == 1 ? 'substitution' : 'substitutions';
1 substitution made
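If the comparison is better kept on the Perl side (for example, because the condition may later become something other than strict equality), a single s///ge can still do the job without a while loop. The following is a sketch under that assumption, not the method used above. The shortened sample string and the $count variable are introduced only for illustration, and the replacement keeps the leading colon, as the replacement string in the question does.

```perl
use strict;
use warnings;

# Shortened, already normalised copy of the question's data (brackets removed).
my $data = 'imperdiet 6660:6890 - velit lectus 6920:6950 '
         . 'eros 8260:8600 - neque non 8600:8890';

my $count = 0;    # number of matches that were actually rewritten
$data =~ s{
    : (?<end>\d+) \s - \s [a-z0-9'-]+   # end value, then the "- word" to drop
    \s (?<needed>[a-z0-9'-]+)           # the word to keep
    \s (?<start>\d+) (?=:)              # start value, followed by a colon
}{
    $+{end} == $+{start}
        ? do { $count++; ":$+{end} $+{needed} $+{start}" }
        : $&                            # values differ: put the text back unchanged
}gexi;

print "$data\n";
print "$count rewritten\n";
```

Because the whole pass is one s///g, the search position is never reset, and every candidate is examined exactly once.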
Answer 1 (score: 2)
So I added some extra debugging code and found something interesting (see lines 2 and 3 below, and the print statement that follows them):
while ($data =~ /:(?<end>\d+) (?<notneeded>- [a-z0-9\-']+) (?<needed>[a-z0-9\-']+?) (?<start>\d+)(?=:)/ig) {
my $start1 = $+{start};
my $end1 = $+{end};
print("start 1 is $start1\nend 1 is $end1\n\n");
if ($start1 == $end1) {
$data =~ s/:(?<end>\d+) (?<notneeded>- [a-z0-9\-']+) (?<needed>[a-z0-9\-']+?) (?<start>\d+)(?=:)/:$+{end} $+{needed} $+{start}/i;
print "\nFull match: '" . "$&" . "'\n";
print "\nStart: '" . "$+{start}" . "'\n";
print "\nEnd: '" . "$+{end}" . "'\n";
print "\nDiff: '" . ($+{start} - $+{end}) . "'\n";
}
}
Here is the output:
[myuser@myhost tmp]$ ./tmp.pl
start 1 is 6920
end 1 is 6890
start 1 is 8600
end 1 is 8600
Full match: ':6890 - velit lectus 6920'
Start: '6920'
End: '6890'
Diff: '30'
start 1 is 8600
end 1 is 8600
Full match: ':8600 - neque non 8600'
Start: '8600'
End: '8600'
Diff: '0'
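As you can see, two pairs of start1/end1 are printed before the first "Full match" block. Only the second pair (8600/8600) satisfies the condition, but the substitution inside the if is a separate match that starts from the beginning of the string, so the values printed after it come from the first occurrence of the pattern (6890/6920). Does that make sense?
Your second regex is missing the 'g' modifier...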
$data =~ s/:(?<end>\d+) (?<notneeded>- [a-z0-9\-']+) (?<needed>[a-z0-9\-']+?) (?<start>\d+)(?=:)/:$+{end} $+{needed} $+{start}/ig;
seems to do the trick.
[myuser@myhost tmp]$ ./tmp.pl
start 1 is 6920
end 1 is 6890
start 1 is 8600
end 1 is 8600
Full match: ':8600 - neque non 8600'
Start: '8600'
End: '8600'
Diff: '0'
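One caveat, as far as I can tell: with /g the inner substitution rewrites every occurrence of the pattern, not only the one whose start and end values are equal. The 6890 ... 6920 pair is therefore changed as well, even though the debug output no longer shows it. A minimal sketch to check this, using a shortened copy of the sample data and a hypothetical $re variable to avoid repeating the pattern:

```perl
use strict;
use warnings;

# Shortened, already normalised copy of the question's data (brackets removed).
my $data = 'imperdiet 6660:6890 - velit lectus 6920:6950 '
         . 'eros 8260:8600 - neque non 8600:8890';

# Same pattern as in the question, stored once for reuse.
my $re = qr/:(?<end>\d+) (?<notneeded>- [a-z0-9\-']+) (?<needed>[a-z0-9\-']+?) (?<start>\d+)(?=:)/i;

while ($data =~ /$re/g) {
    next unless $+{start} == $+{end};

    # With /g this replaces all matches of $re in $data, including the
    # pair whose values differ (6890 vs 6920).
    $data =~ s/$re/:$+{end} $+{needed} $+{start}/g;
}

print "$data\n";
```

On this sample both "- velit" and "- neque" are removed, so if only the 8600 pair should change, the condition has to be part of the substitution itself.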
Answer 2 (score: 0)
Here is my modified code, which works correctly:
#!/usr/bin/perl
use strict;
use warnings;
local $/ = undef;
my $data = <DATA>;
$data =~ s/(\s+)/ /gi;
$data =~ s/\[|\]//gi;
while ($data =~ /:(?<end>\d+?) (?<notneeded>- [a-z0-9\-']+?) (?<needed>[a-z0-9\-']+?) (?<start>\d+?)(?=:)/ig) {
if ("$+{start}" eq "$+{end}") {
print "\nFull match: '" . "$&" . "'\n";
print "\nStart: '" . "$+{start}" . "'\n";
print "\nEnd: '" . "$+{end}" . "'\n";
print "\nDiff: '" . ($+{start} - $+{end}) . "'\n";
my $start = $+{start};
my $end = $+{end};
my $needed = $+{needed};
# \Q...\E makes the matched text be treated literally rather than as a pattern
$data =~ s/\Q$&\E/:$end $needed $start/i;
}
}
print "$data\n\n";
__DATA__
lorem [1170:1540]
ipsum [1540:2040]
dolor [2040:2350]
sit [2350:2510]
amet [2510:2670]
consectetur [2670:3130]
adipiscing [3130:3240]
elit [3240:3470]
quisque [3550:4070]
egestas [4070:4290]
magna [4290:4570]
sit [4620:4650]
amet [4780:5390]
molestie [5480:6660]
imperdiet [6660:6890]
- velit
lectus [6920:6950]
egestas [7130:7530]
enim [7570:7830]
non [7830:8160]
ornare [8160:8260]
eros [8260:8600]
- neque
non [8600:8890]
risus [9120:9450]
aenean [9450:9570]
venenatis [9570:9820]
- hendrerit
- urna
- nec
- bibendum
nunc [11210:11380]
lobortis [11380:11470]
in [11470:11710]
in [11780:11810]
facilisis [12960:13340]
urna [13340:13460]
in [13460:13920]
neque [14070:14630]
bibendum [14630:14930]
lobortis [14930:15250]
- maecenas
- efficitur
- fermentum
eros [17060:17450]
malesuada [17450:17760]
posuere [17760:17810]
nisi [18050:18080]
- tristique
- sit