How do I capture values whose position is not fixed?

Time: 2019-06-20 11:27:06

Tags: regex perl

The string contains:

#EXT-X-STREAM-INF:BANDWIDTH=1439890,RESOLUTION=640x480,CODECS="avc1.42001f,mp4a.40.2"

The order of the key-value pairs is not fixed. Examples:

#EXT-X-STREAM-INF:BANDWIDTH=1439890,RESOLUTION=640x480,CODECS="avc1.42001f,mp4a.40.2"
#EXT-X-STREAM-INF:RESOLUTION=640x480,BANDWIDTH=1439890,CODECS="avc1.42001f,mp4a.40.2"
#EXT-X-STREAM-INF:CODECS="avc1.42001f,mp4a.40.2",RESOLUTION=640x480,BANDWIDTH=1439890

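How can I use a single regular expression to get the values when the pairs can appear in any order? This example only works when the pairs are in a fixed order: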

use v5.10;  # enables say; named captures also need 5.10+

my $str = '#EXT-X-STREAM-INF:BANDWIDTH=1439890,RESOLUTION=640x480,CODECS="avc1.42001f,mp4a.40.2"';
if ($str =~/#EXT-X-STREAM-INF:BANDWIDTH=(?'bandwidth'.+?),RESOLUTION=(?'resolution'.+?),CODECS="(?'codecs'.+?)"$/) {
    say $+{bandwidth};
    say $+{resolution};
    say $+{codecs};
}

2 answers:

Answer 0: (score: 3)

Use positive lookaheads:

use v5.10;  # enables say

while(<DATA>) {
    print;
    if (/^#EXT-X-STREAM-INF:(?=.*BANDWIDTH=(?'bandwidth'[^,]+))(?=.*RESOLUTION=(?'resolution'[^,]+))(?=.*CODECS="(?'codecs'[^"]+))/) {
        say $+{bandwidth};
        say $+{resolution};
        say $+{codecs};
    } else {
        say 'NO match';
    }
}

__DATA__
#EXT-X-STREAM-INF:BANDWIDTH=1439890,RESOLUTION=640x480,CODECS="avc1.42001f,mp4a.40.2"
#EXT-X-STREAM-INF:RESOLUTION=640x480,BANDWIDTH=1439890,CODECS="avc1.42001f,mp4a.40.2"
#EXT-X-STREAM-INF:CODECS="avc1.42001f,mp4a.40.2",RESOLUTION=640x480,BANDWIDTH=1439890

Output:

#EXT-X-STREAM-INF:BANDWIDTH=1439890,RESOLUTION=640x480,CODECS="avc1.42001f,mp4a.40.2"
1439890
640x480
avc1.42001f,mp4a.40.2
#EXT-X-STREAM-INF:RESOLUTION=640x480,BANDWIDTH=1439890,CODECS="avc1.42001f,mp4a.40.2"
1439890
640x480
avc1.42001f,mp4a.40.2
#EXT-X-STREAM-INF:CODECS="avc1.42001f,mp4a.40.2",RESOLUTION=640x480,BANDWIDTH=14398901439890
640x480
avc1.42001f,mp4a.40.2
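
Every lookahead starts at the same position, right after the colon, and scans the rest of the line independently, which is why the order of the keys does not matter. As a rough sketch, the same pattern can be wrapped in a sub that returns the captures as a hash reference. The name parse_stream_inf is hypothetical, and the sketch assumes the line has already been chomped, since [^,]+ would otherwise capture a trailing newline:

```perl
use strict;
use warnings;
use v5.10;

# Same lookahead pattern as in the answer, precompiled once.
my $re = qr/^#EXT-X-STREAM-INF:(?=.*BANDWIDTH=(?<bandwidth>[^,]+))(?=.*RESOLUTION=(?<resolution>[^,]+))(?=.*CODECS="(?<codecs>[^"]+))/;

# Hypothetical helper: returns a hash ref with the three values,
# or nothing when at least one of the keys is missing.
sub parse_stream_inf {
    my ($line) = @_;
    return unless $line =~ $re;
    my %captures = %+;   # copy the named captures before the next match
    return \%captures;
}

my @lines = (
    '#EXT-X-STREAM-INF:BANDWIDTH=1439890,RESOLUTION=640x480,CODECS="avc1.42001f,mp4a.40.2"',
    '#EXT-X-STREAM-INF:CODECS="avc1.42001f,mp4a.40.2",RESOLUTION=640x480,BANDWIDTH=1439890',
);

for my $line (@lines) {
    my $info = parse_stream_inf($line) or next;
    say join ' ', @{$info}{qw(bandwidth resolution codecs)};
}
```

Because all three lookaheads must succeed, a line that lacks any one of the keys does not match at all.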

Answer 1: (score: 2)

The simplest approach is probably to extract all the values into a hash first:

use v5.12.0;
use warnings;

my @variants = (
    '#EXT-X-STREAM-INF:BANDWIDTH=1439890,RESOLUTION=640x480,CODECS="avc1.42001f,mp4a.40.2"',
    '#EXT-X-STREAM-INF:RESOLUTION=640x480,BANDWIDTH=1439890,CODECS="avc1.42001f,mp4a.40.2"',
    '#EXT-X-STREAM-INF:CODECS="avc1.42001f,mp4a.40.2",RESOLUTION=640x480,BANDWIDTH=1439890',
);

for my $str (@variants) {
    say "trying $str ...";
    my %data = $str =~ /(\w+)=(?|([^",]+)|"([^"]*)")/g;

    say "bandwidth:  $data{BANDWIDTH}";
    say "resolution: $data{RESOLUTION}";
    say "codecs:     $data{CODECS}";
    say "";
}


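In list context, m//g tries to match as many times as possible and returns a list of all the captured strings. Here, each match captures two strings: because of the branch reset pattern (?|...), the two alternatives share a single group number and count as one group.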
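
To see what the branch reset changes, here is a small sketch comparing it with an ordinary non-capturing group. The sample string and variable names are made up for illustration:

```perl
use strict;
use warnings;
use v5.10;

my $str = 'A=1,B="x,y"';

# Ordinary grouping: the alternatives are groups 2 and 3, so each match
# contributes three list elements, one of which is undef.
my @plain = $str =~ /(\w+)=(?:([^",]+)|"([^"]*)")/g;
say scalar @plain;       # 6

# Branch reset: both alternatives are numbered as group 2, so each match
# contributes exactly two elements (key, value).
my @reset = $str =~ /(\w+)=(?|([^",]+)|"([^"]*)")/g;
say scalar @reset;       # 4
say join '|', @reset;    # A|1|B|x,y
```

Without the branch reset, the undef placeholders would shift the key/value pairing when the list is assigned to a hash.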

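Assigning a list to a hash treats the elements at even indices (0, 2, 4, ...) as keys and the elements at odd indices as the corresponding values.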
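
A minimal sketch of that assignment, using a hand-written list in place of the regex result. The 'n/a' default for a missing key is an illustrative addition, not part of the answer:

```perl
use strict;
use warnings;
use v5.10;

# Indices 0, 2, ... become keys; indices 1, 3, ... become their values.
my @pairs = ('BANDWIDTH', '1439890', 'RESOLUTION', '640x480');
my %data  = @pairs;

say $data{BANDWIDTH};          # 1439890
say $data{RESOLUTION};         # 640x480

# A key that never appeared in the list is simply absent from the hash.
say $data{CODECS} // 'n/a';    # n/a
```

Under use warnings, interpolating a missing key directly into a string would emit an "uninitialized value" warning, so a default like this can be useful when some attributes are optional.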