Perl regular expression: replace everything except a pattern

Date: 2016-08-13 13:19:25

Tags: regex perl substitution

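In Perl, I want to use a negated character class (everything except the pattern) in a substitution so that only the expected string is kept. Normally this approach should work, but in my case it doesn't: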

$var =~ s/[^PATTERN]//g;

Original string:

$string = '<iframe src="https://foo.bar/embed/b74ed855-63c9-4795-b5d5-c79dd413d613?autoplay=1&context=cGF0aD0yMSwx</iframe>'; 

Desired pattern: b74ed855-63c9-4795-b5d5-c79dd413d613

(5 groups of hexadecimal digits separated by 4 dashes)

My code:

$pattern2keep = "[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}";  

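(It should match only xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx: 5 groups of hexadecimal digits separated by 4 dashes, with lengths 8-4-4-4-12.)

The following should remove everything except the pattern, but in fact it doesn't: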

$string =~ s/[^$pattern2keep]//g;

What am I doing wrong? Thanks.

1 answer:

Answer 0: (score: 8)

A character class matches a single character equal to any one of the characters in the class. If the class begins with a caret, the class is negated, so it matches any one character that isn't in the class.
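To make the per-character behaviour concrete, here is a minimal sketch. The sample text and the variable names $text and $stripped are made up for illustration and are not from the question.

```perl
use strict;
use warnings;

# Hypothetical sample input, not taken from the question.
my $text = 'id=b74e-63c9, face!';

# Negated class: delete every single character that is NOT
# a lowercase hex digit or a dash. The trailing - in the class is literal.
(my $stripped = $text) =~ s/[^0-9a-f-]//g;

print "$stripped\n";    # prints "db74e-63c9face"
```

Note that the d of "id" and the word "face" survive: the class tests one character at a time and has no notion of the 8-4-4-4-12 shape.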

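Interpolating $pattern2keep into [^...] therefore does not negate the pattern. The class ends at the first ], so [^[0-9a-f] matches one character that is not [, a digit, or a letter from a to f. Everything after it, {8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}], is parsed as ordinary regex that ends in a literal ]. Your string contains no ], so nothing matches and the substitution changes nothing.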

You need to capture the substring instead, like this:

use strict;
use warnings 'all';
use feature 'say';

my $string = '<iframe src="https://foo.bar/embed/b74ed855-63c9-4795-b5d5-c79dd413d613?autoplay=1&context=cGF0aD0yMSwx</iframe>';

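# 8-4-4-4-12 hex digits; /x lets the pattern contain whitespace for readability
my $pattern_to_keep = qr/ \p{hex}{8} (?: - \p{hex}{4} ){3} - \p{hex}{12} /x;

my $kept;

# $kept stays undefined if the pattern is not found
$kept = $1 if $string =~ /($pattern_to_keep)/;

say $kept // 'undef';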

Output:

b74ed855-63c9-4795-b5d5-c79dd413d613
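
If a substitution is still preferred, one possible sketch is to match the whole string and replace it with the captured group. It assumes only the first ID in the string is of interest. The names $id_re and $copy are hypothetical, and the pattern uses explicit lowercase hex classes as in the question. A list-context match is shown as well, as another way to write the capture without $1.

```perl
use strict;
use warnings;
use feature 'say';

my $string = '<iframe src="https://foo.bar/embed/b74ed855-63c9-4795-b5d5-c79dd413d613?autoplay=1&context=cGF0aD0yMSwx</iframe>';

# Same 8-4-4-4-12 shape as in the question (lowercase hex only).
my $id_re = qr/[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}/;

# Option 1: a match in list context returns the captured group directly;
# $kept stays undef when nothing matches.
my ($kept) = $string =~ /($id_re)/;

# Option 2: a substitution that swallows everything around the first match.
# /s lets . match newlines; the copy is left unchanged if there is no match.
(my $copy = $string) =~ s/\A.*?($id_re).*\z/$1/s;

say $kept // 'undef';    # b74ed855-63c9-4795-b5d5-c79dd413d613
say $copy;               # b74ed855-63c9-4795-b5d5-c79dd413d613
```

Option 2 works on a copy so the original $string is preserved. When there is no match it leaves the text as it was, so check the result before relying on it.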