How do I match using a positive regex lookahead, but exclude the lookahead part from the match?

Time: 2018-03-02 15:25:15

Tags: regex regex-lookarounds

The lines to match are

part1a_part1b__part1c_part1d_part3.extension
part1a_part1b__part1c_part1d__part3.extension
part1a_part1b__part1c_part1d_part2short_part3.extension
part1a_part1b__part1c_part1d_part2short__part3.extension
part1a_part1b__part1c_part1d_part2_part3.extension
part1a_part1b__part1c_part1d_part2__part3.extension
part1a_part1b__part1c_part1d_part2full_part3.extension
part1a_part1b__part1c_part1d_part2full__part3.extension
part1a_part1b__part1c_part1d_part2short-part3.extension
part1a_part1b__part1c_part1d_part2-part3.extension
part1a_part1b__part1c_part1d_part2full-part3.extension
part1a_part1b__part1c_part1d_part4.extension
part1a_part1b__part1c_part1d__part4.extension

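For all of the lines above except the last two, the desired match is exactly part1a_part1b__part1c_part1d. That is, the "stem" consists of any number of part1 pieces, may be followed by an optional part2 (in a limited set of forms: part2short, part2, or part2full), and the line must end with part3.extension.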

Right now, I only have

(?P<stem>[[:alnum:]_-]+)(?=(|part2short|part2|part2full))[_-]+part3\.extension

which matches the following "stem" values for the lines above:

part1a_part1b__part1c_part1d
part1a_part1b__part1c_part1d_
part1a_part1b__part1c_part1d_part2short
part1a_part1b__part1c_part1d_part2short_
part1a_part1b__part1c_part1d_part2
part1a_part1b__part1c_part1d_part2_
part1a_part1b__part1c_part1d_part2full
part1a_part1b__part1c_part1d_part2full_
part1a_part1b__part1c_part1d_part2short
part1a_part1b__part1c_part1d_part2
part1a_part1b__part1c_part1d_part2full

If possible, could you comment on how to match exactly part1a_part1b__part1c_part1d for all of the lines above except the last two?
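
For anyone who wants to reproduce the behaviour described in the question, here is a minimal Python sketch. It rests on one assumption: Python's built-in re module does not support POSIX classes such as [[:alnum:]], so [A-Za-z0-9] is substituted for it. The names QUESTION_PATTERN, SAMPLES and question_stem are made up for illustration. The sketch suggests why the stem swallows part2: the lookahead contains an empty alternative, so it succeeds at every position and constrains nothing, and the greedy stem gives back only as much as [_-]+part3\.extension needs.

```python
import re

# Assumption: [A-Za-z0-9] stands in for [[:alnum:]], which Python's `re` lacks.
QUESTION_PATTERN = re.compile(
    r"(?P<stem>[A-Za-z0-9_-]+)"
    r"(?=(|part2short|part2|part2full))"  # empty alternative: always succeeds
    r"[_-]+part3\.extension"
)

# A few of the lines from the question.
SAMPLES = [
    "part1a_part1b__part1c_part1d_part3.extension",
    "part1a_part1b__part1c_part1d__part3.extension",
    "part1a_part1b__part1c_part1d_part2short_part3.extension",
    "part1a_part1b__part1c_part1d_part2full-part3.extension",
    "part1a_part1b__part1c_part1d_part4.extension",
]


def question_stem(line):
    """Return the 'stem' group captured by the question's pattern, or None."""
    m = QUESTION_PATTERN.search(line)
    return m.group("stem") if m else None


for line in SAMPLES:
    print(f"{line} -> {question_stem(line)}")
```

The greedy stem keeps the part2 piece (and any extra separator) because nothing after it forces the engine to release those characters.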

2 answers:

Answer 0 (score: 1)

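You can match the first four parts, which consist of text separated by underscores, and use a positive lookahead to assert that the string ends with part3.extension: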

^(?P<stem>[^_]+_[^_]+__[^_]+_[^_]+)(?=.*part3\.extension$)

Pattern breakdown:

^                     # Start of the string
(?P<stem>             # Named captured group stem
[^_]+_                # Match not _ one or more times, then _
[^_]+__               # Match not _ one or more times, then __
[^_]+_                # Match not _ one or more times, then _
[^_]+                 # Match not _ one or more times
)                     # Close named capturing group
(?=                   # A positive lookahead that asserts what follows
  .*part3\.extension$ # Match part3.extension at the end of the string
)                     # Close lookahead
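
As a quick check, the pattern above can be tried in Python, whose re module accepts the (?P<stem>...) syntax as written. This is only a sketch: the names STEM_BY_SHAPE and stem_by_shape are invented for the example, and each line is tested on its own, so no multiline flag is needed.

```python
import re

# The pattern from this answer, unchanged.
STEM_BY_SHAPE = re.compile(
    r"^(?P<stem>[^_]+_[^_]+__[^_]+_[^_]+)(?=.*part3\.extension$)"
)


def stem_by_shape(line):
    """Return the stem if the line has the expected shape and ending, else None."""
    m = STEM_BY_SHAPE.match(line)
    return m.group("stem") if m else None


print(stem_by_shape("part1a_part1b__part1c_part1d_part3.extension"))
print(stem_by_shape("part1a_part1b__part1c_part1d_part2short-part3.extension"))
print(stem_by_shape("part1a_part1b__part1c_part1d__part4.extension"))
```

Note that this approach relies on the stem always having the fixed shape a_b__c_d, and the lookahead only checks the ending, so it does not restrict which forms of part2 may appear in between.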

Answer 1 (score: 1)

You can use this regex, which combines a non-greedy (lazy) match with a lookahead that contains an optional group:

(?m)^(?P<stem>[[:alnum:]_-]+?)(?=(?:[_-]+part2(?:short|full)?)?[_-]+part3\.extension$)


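(?=(?:[_-]+part2(?:short|full)?)?[_-]+part3\.extension$) is a positive lookahead that asserts the line ends with [_-]+part3.extension, optionally preceded by a [_-]+part2... segment (part2, part2short, or part2full). Because a lookahead consumes no characters, only the lazily matched stem ends up in the match.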
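
A rough Python sketch of this answer follows. Two things are assumptions rather than part of the answer: [A-Za-z0-9] replaces [[:alnum:]], because Python's built-in re module has no POSIX classes, and the inline (?m) is expressed as the re.MULTILINE flag. The names STEM_LAZY, TEXT and stems are illustrative only.

```python
import re

# Assumption: [A-Za-z0-9] stands in for [[:alnum:]]; re.MULTILINE replaces (?m).
STEM_LAZY = re.compile(
    r"^(?P<stem>[A-Za-z0-9_-]+?)"
    r"(?=(?:[_-]+part2(?:short|full)?)?[_-]+part3\.extension$)",
    re.MULTILINE,
)

# A few of the question's lines, one per line of text; the last must not match.
TEXT = """\
part1a_part1b__part1c_part1d_part3.extension
part1a_part1b__part1c_part1d_part2short__part3.extension
part1a_part1b__part1c_part1d_part2-part3.extension
part1a_part1b__part1c_part1d_part2full_part3.extension
part1a_part1b__part1c_part1d__part4.extension
"""

# The lookahead is zero-width, so each match consists of the stem alone.
stems = [m.group("stem") for m in STEM_LAZY.finditer(TEXT)]
print(stems)
```

The lazy quantifier makes the stem grow one character at a time and stop at the first position where the lookahead succeeds, which is why the separators and part2 stay outside the capture.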