use strict;
use warnings;
use XML::Twig;
my @discard = qw / abc de bond/;
my $filter = join '|', @discard;
$filter = qr/\b(?:$filter)\b/;
my $twig = XML::Twig->new;
$twig->parse(\*DATA);
for my $line ( $twig->findnodes('//line') ) {
    $line->delete if $line->text =~ $filter;
}
$twig->print;
__DATA__
<data>
<line> sdfe abc adsfefsdf </line>
<line> abcsdffedcfsdf sdf </line>
<line> sdfe </line>
<line> abc </line>
<line> sdabc sfefsdf </line>
<line>
<id> bond </id>
<dest> UK </dest>
adsfefsdf
</line>
<line> fhgh kk jj hjsda </line>
<line> abc </line>
..
..
..
</data>
The program above produces the following output:
<data><line> abcsdffedcfsdf sdf </line><line> sdfe </line><line> sdabc sfefsdf </line><line> fhgh kk jj hjsda </line>
..
..
..
</data>
This is the desired output:
<data>
<line> sdfe </line>
<line> fhgh kk jj hjsda </line>
..
..
..
</data>
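Conditions to consider for the desired output:
Match, prematch, and postmatch the input values provided in the array, and remove the tags from the input data wherever they occur.
Example:
Match ---- abc
Prematch ---- sdabc
Postmatch ---- abcsdffedcfsdf
Make sure the output format is similar to that of the input data.
** Match, Prematch, and Postmatch are my own terms, as illustrated above.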
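Answer 0 (score: 1)
Are you asking how to filter out the line elements that contain a word beginning or ending with one of the strings in @discard? If so, just replace the search pattern with the following: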
my $filter = join '|', map quotemeta, @discard;
$filter = "(?:$filter)";
$filter = qr/\b$filter|$filter\b/;
Output:
<data><line> sdfe </line><line> fhgh kk jj hjsda </line>
..
..
..
</data>
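To see why the two patterns behave differently, here is a minimal sketch that runs both against plain strings copied from the question's line elements, without XML::Twig. The variable names ($alt, $whole_word, $edge, @samples, @kept) are only for illustration and do not appear in the original code.

```perl
use strict;
use warnings;

my @discard = qw/abc de bond/;

# Escape each string and join them into one alternation.
my $alt = join '|', map quotemeta, @discard;

# Question's pattern: a word boundary is required on BOTH sides,
# so only a whole word equal to one of the strings matches.
my $whole_word = qr/\b(?:$alt)\b/;

# Answer's pattern: a word boundary on EITHER side is enough,
# so the string may start a word (abcsdf...) or end one (sdabc).
my $edge = qr/\b(?:$alt)|(?:$alt)\b/;

# Sample texts taken from the <line> elements in the question.
my @samples = (
    'sdfe abc adsfefsdf',
    'abcsdffedcfsdf sdf',
    'sdfe',
    'sdabc sfefsdf',
    'fhgh kk jj hjsda',
);

for my $text (@samples) {
    printf "%-20s whole_word=%d edge=%d\n",
        $text,
        ( $text =~ $whole_word ? 1 : 0 ),
        ( $text =~ $edge       ? 1 : 0 );
}

# The texts that survive the answer's filter.
my @kept = grep { $_ !~ $edge } @samples;
print "kept: $_\n" for @kept;
```

Only 'sdfe' and 'fhgh kk jj hjsda' survive the second pattern, which corresponds to the two line elements left in the output above. As for the formatting requirement, XML::Twig accepts a pretty_print option in its constructor (for example pretty_print => 'indented'), which may bring the printed output closer to the input layout; how it treats the mixed text content inside data has not been verified here.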