Regex to match only strings that end with a specific word and do not contain certain other words

Time: 2016-01-15 09:00:55

Tags: regex regex-lookarounds regex-greedy

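I'm trying to write a regex that matches a string ending in 'Remixes', but only when it is not preceded by certain words and characters. I've come up with the following regexes, which give different results, but neither matches exactly what I want: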

project_name.vcxproj -> path\to\output\directory\library_name.dll

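This excludes all the keywords in the string, but it doesn't match titles with multiple words, e.g. Think Twice Remixes, or titles that contain a preceding word, e.g. Various Remixes.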

{{1}}

This excludes examples such as Fill Me Up + Remixes, but it doesn't exclude the other examples with keywords, e.g. Sides & Remixes.

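How can I make the first regex match when the string contains several preceding words, and not match when an excluded word is the only preceding word?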

1 Answer:

Answer 0: (score: 1)

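Honestly, I wouldn't. Regex is a powerful tool and you can do a lot with it, but your code becomes much simpler and clearer when you don't try to solve every problem with a single regex.

For your example, I'd be tempted to use Perl's grep function, which lets you specify compound conditions: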

 # Keep items that end in "Remixes" unless an excluded word comes right before it.
 my @filtered = grep { m/Remixes$/
                     and not
                        m/(And
                             |The
                             |Of
                             |Various
                             |House
                             |Unreleased
                             |Selected
                         )\s*.?\s+Remixes/xi } @list_of_things;

For example:

#!/usr/bin/env perl
use strict;
use warnings;

#set up a list of words to exclude when prefixing "Remix"
#qw is perl's "quote words" and lets you specify whitespace delimited values. 
my @exclude_remix_prefix = qw ( And
    The
    Of
    Various
    House
    Unreleased
    Selected );

#turn that into a sub regex (qr 'compiles' a regex). 
my $exclude = join( "|", @exclude_remix_prefix );
$exclude = qr/($exclude)\s+Remixes/i;

#read from the <DATA> filehandle, 
#but you could use <> to read from STDIN/filenames like 'sed/grep' do. 
my @filtered = grep { m/Remixes$/i and not m/$exclude/i; } <DATA>;

print @filtered;

__DATA__
Fill Me Up + Remixes
Sides & Remixes
Something Selected remixes

Output:

Fill Me Up + Remixes
Sides & Remixes
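
One caveat with the alternation above: without a word boundary, `And` also matches the tail of a longer word, so a title such as `Band Remixes` would be dropped. Below is a minimal sketch of a stricter pattern, assuming whole-word matching is what you want. The `build_exclude` and `keep_title` helpers and the `Band Remixes` sample are made up for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: build a case-insensitive pattern that matches an
# excluded word, as a whole word, directly before "Remixes".
sub build_exclude {
    my @words = @_;
    # quotemeta guards against regex metacharacters in the word list.
    my $alternation = join '|', map { quotemeta($_) } @words;
    return qr/\b(?:$alternation)\s+Remixes/i;
}

# Hypothetical helper: returns 1 if the title ends in "Remixes" and no
# excluded word comes immediately before it, otherwise 0.
sub keep_title {
    my ( $title, $exclude ) = @_;
    my $ok = $title =~ m/Remixes$/i && $title !~ $exclude;
    return $ok ? 1 : 0;
}

my $exclude = build_exclude(qw( And The Of Various House Unreleased Selected ));

for my $title ( 'Band Remixes', 'Various Remixes', 'Think Twice Remixes' ) {
    printf "%-22s %s\n", $title, keep_title( $title, $exclude ) ? 'keep' : 'skip';
}
```

The `\b` is the only behavioural change from the pattern in the script above; the rest just wraps the same compound condition in a reusable sub.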

(Give me some more samples of what should and shouldn't match, and I'll extend this.)
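
For comparison, the same filter can be squeezed into one pattern with a negative lookahead, which is roughly what the question was reaching for. This is only a sketch, under the assumption that a title should be rejected when an excluded word sits directly before the final `Remixes`. The `$single` name and the sample titles are illustrative.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# One pattern instead of two: the negative lookahead (?! ... ) rejects any
# title where an excluded word directly precedes the final "Remixes".
# The /x flag allows the pattern to be laid out with whitespace.
my $single = qr/
    ^
    (?! .* \b (?: And | The | Of | Various | House | Unreleased | Selected )
        \s+ Remixes $
    )
    .* Remixes $
/xi;

my @titles = (
    'Think Twice Remixes',
    'The Beauty And The Beast Remixes',
    'Various Remixes',
    'Something Selected remixes',
);

# Keep only the titles the single pattern accepts.
my @kept = grep { $_ =~ $single } @titles;
print "$_\n" for @kept;
```

It works, but the two-condition grep is arguably easier to read and to extend when the word list changes.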

We may be drifting away from your original use case, but if you want to build a transformation mapping:

#!/usr/bin/env perl
use strict;
use warnings;

use Data::Dumper;

my @exclude_remix_prefix = qw ( And
    The
    Of
    Various
    House
    Unreleased
    Selected );

my $exclude = join( "|", @exclude_remix_prefix );
$exclude = qr/($exclude)\s+Remixes/i;

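# Map each kept title to the same title without the trailing "Remixes".
# The pair is parenthesised so an excluded line yields an empty list, not a lone value.
my %transform = map {
    chomp;
    ( !m/$exclude/ and m/^(.*)\s+Remixes$/i ) ? ( $_ => $1 ) : ();
} <DATA>;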
print Dumper \%transform; 

__DATA__
Euterpeh Remixes
The Beauty And The Beast Remixes
Think Twice Remixes
Stop And Reset Remixes

This produces a hash containing:

$VAR1 = {
          'The Beauty And The Beast Remixes' => 'The Beauty And The Beast',
          'Think Twice Remixes' => 'Think Twice',
          'Euterpeh Remixes' => 'Euterpeh',
          'Stop And Reset Remixes' => 'Stop And Reset'
        };

You could use this to generate a series of rename operations.
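
As a rough sketch of that idea, the hash can be turned into a list of rename operations. The `rename_plan` helper and the sample hash are made up for illustration, and the code only prints what it would do. Swapping the `print` for Perl's built-in `rename` would apply the change, assuming the keys really are file or directory names.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: turn an { old_name => new_name } hash into a sorted
# list of [ old, new ] pairs, skipping entries where nothing would change.
sub rename_plan {
    my ($transform) = @_;
    return map  { [ $_, $transform->{$_} ] }
           grep { $_ ne $transform->{$_} }
           sort keys %{$transform};
}

# Sample mapping in the same shape as the Dumper output above.
my %transform = (
    'Think Twice Remixes' => 'Think Twice',
    'Euterpeh Remixes'    => 'Euterpeh',
);

for my $pair ( rename_plan( \%transform ) ) {
    my ( $old, $new ) = @{$pair};
    # Dry run: print the operation instead of calling rename( $old, $new ).
    print qq{rename "$old" -> "$new"\n};
}
```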

Or, if you just want to do some operation 'in place', use a for loop:

for ( <DATA> ) { 
    chomp; 
    next if m/$exclude/; 
    print "rename ", m/(.*)\s+Remixes/, " ", m/(.*)/,"\n";
}

(OK, I know 'rename' isn't what you're trying to do, but...)