Pattern matching - lookahead and lookbehind

Time: 2016-05-20 17:52:34

Tags: perl

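use strict;
use warnings;
use XML::Twig;

# Strings whose presence should cause a <line> element to be dropped
my @discard = qw/abc de bond/;

# Build one alternation that matches any of them as a whole word
my $filter = join '|', @discard;
$filter = qr/\b(?:$filter)\b/;

my $twig = XML::Twig->new;
$twig->parse(\*DATA);

# Delete every <line> whose text content matches the filter
for my $line ( $twig->findnodes('//line') ) {
    $line->delete if $line->text =~ $filter;
}
$twig->print;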

__DATA__
<data>
    <line> sdfe abc adsfefsdf </line>
    <line> abcsdffedcfsdf sdf </line>
    <line> sdfe </line>
    <line> abc </line>
    <line> sdabc sfefsdf </line>
    <line>
        <id> bond </id>
        <dest> UK </dest>
        adsfefsdf
    </line>
    <line> fhgh kk jj hjsda </line>
    <line> abc </line>
    ..
    ..
    ..
</data>

The program above produces the following result:

<data><line> abcsdffedcfsdf sdf </line><line> sdfe </line><line> sdabc sfefsdf </line><line> fhgh kk jj hjsda </line>
    ..
    ..
    ..
</data>

This is the desired output:

<data>
<line> sdfe </line>
<line> fhgh kk jj hjsda </line>
    ..
    ..
    ..
</data>

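Conditions to consider for the desired output:

  1. Match, prematch, and postmatch the input values supplied in the array against the input data, and delete the <line> tag wherever one of them is found.
    Example:    Match ---- abc
         Prematch ---- sdabc
         Postmatch ---- abcsdffedcfsdf

  2. Make sure the output format stays similar to the input data.

  3. ** "Match", "Prematch" and "Postmatch" are my own terms, as described above.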
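One possible reading of the three terms in the list above is sketched below using core Perl only. The classify helper is hypothetical (it does not appear in the question), and the mapping is an assumption drawn from the examples: "match" when the word is exactly the discard string, "prematch" when extra text precedes it, and "postmatch" when extra text follows it.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the question): reports how $needle sits
# inside $word, using the asker's three terms.
sub classify {
    my ($word, $needle) = @_;
    my $q = quotemeta $needle;                 # treat the needle as literal text
    return 'match'     if $word =~ /\A$q\z/;   # the word is exactly the needle
    return 'prematch'  if $word =~ /$q\z/;     # extra text before the needle
    return 'postmatch' if $word =~ /\A$q/;     # extra text after the needle
    return 'none';                             # needle is absent or mid-word
}

# Words taken from the sample data in the question
for my $word (qw(abc sdabc abcsdffedcfsdf sdfe)) {
    printf "%-15s %s\n", $word, classify($word, 'abc');
}
```

Under this reading, a <line> should be dropped whenever any of its words classifies as something other than 'none' for any discard string.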

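1 Answer:

Answer 0: (score: 1)

Are you asking how to filter out the line elements that contain a word starting or ending with one of the strings in @discard? If so, just replace the search pattern with the following: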

my $filter = join '|', map quotemeta, @discard;
$filter = "(?:$filter)";
$filter = qr/\b$filter|$filter\b/;

Output:

<data><line> sdfe </line><line> fhgh kk jj hjsda </line>
    ..
    ..
    ..
</data>
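
The pattern from the answer can be tried without XML::Twig. The sketch below applies it to the text of each sample <line> using only core Perl, and also spells the same idea with a lookbehind and a lookahead, which is what the question title refers to. The @lines array is an illustrative stand-in for the parsed document (the nested "bond" entry is flattened by hand), and the lookaround form assumes every discard string begins and ends with a word character, as they do here.

```perl
use strict;
use warnings;

my @discard = qw/abc de bond/;
my $alt = join '|', map quotemeta, @discard;

# The answer's pattern: a discard string at the start or at the end of a word
my $boundary = qr/\b(?:$alt)|(?:$alt)\b/;

# Same idea with lookbehind/lookahead (assumption: the discard strings
# begin and end with word characters)
my $lookaround = qr/(?<!\w)(?:$alt)|(?:$alt)(?!\w)/;

# Text of each sample <line>; hand-written approximation of $line->text
my @lines = (
    ' sdfe abc adsfefsdf ',
    ' abcsdffedcfsdf sdf ',
    ' sdfe ',
    ' abc ',
    ' sdabc sfefsdf ',
    ' bond  UK  adsfefsdf ',
    ' fhgh kk jj hjsda ',
    ' abc ',
);

# Keep only the lines that neither pattern flags
my @kept         = grep { $_ !~ $boundary }   @lines;
my @kept_by_look = grep { $_ !~ $lookaround } @lines;

print "kept: $_\n" for @kept;
```

On this sample, both patterns keep only the "sdfe" and "fhgh kk jj hjsda" lines. Note that neither form flags a discard string that sits strictly in the middle of a word, with other word characters on both sides.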