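Exclude/include regex pattern

Time: 2018-03-22 12:38:24

Tags: regex pattern-matching regex-negation

I want to get better at writing different kinds of regular expressions, so I have been working on the following.

Here is my expression: https://regex101.com/r/VzspFy/4/

In the test strings, the first 3 are fine, so patterns like those must match. The problem is the last one, which I don't want to include, so I tried this: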

https://regex101.com/r/9HVKTK/2

and this:

https://regex101.com/r/9HVKTK/1

But no luck!

The main idea is:

`aaa ... bbb ccc` -> must match
`ccc ... (aaa|ddd|eee) ... bbb ccc` -> should not match

How can I make this work, or is there a better way to implement it?

2 answers:

Answer 0: (score: 1)

You can use

var rx = new Regex(@"(?:^|])(?:(?!\b(?:eng|ita)\b)[^]])*\b(eng(?:\W+\w+)?\W+sub\W+ita)\b", RegexOptions.Compiled | RegexOptions.IgnoreCase);

See the regex demo. You need to get the Group 1 value.
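
To illustrate why Group 1 is the value to read, here is a minimal sketch using the pattern above. The class name `GroupOneDemo` and the helper `Find` are made up for this illustration. The whole match also contains the prefix that the pattern skips over, while Group 1 holds only the `eng ... sub ita` part.

```csharp
using System;
using System.Text.RegularExpressions;

class GroupOneDemo
{
    // Same pattern as in the answer; the class and helper names are hypothetical.
    static readonly Regex Rx = new Regex(
        @"(?:^|])(?:(?!\b(?:eng|ita)\b)[^]])*\b(eng(?:\W+\w+)?\W+sub\W+ita)\b",
        RegexOptions.IgnoreCase);

    public static Match Find(string input) => Rx.Match(input);

    static void Main()
    {
        Match m = Find("Young Sheldon S01e13 [SATRip 720p - H264 - Eng Ac3 - Sub Ita] HDTV by AVS");

        // Whole match: starts at the beginning of the string and includes the skipped prefix.
        Console.WriteLine(m.Value);
        // prints: Young Sheldon S01e13 [SATRip 720p - H264 - Eng Ac3 - Sub Ita

        // Group 1: only the captured part.
        Console.WriteLine(m.Groups[1].Value);
        // prints: Eng Ac3 - Sub Ita
    }
}
```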

Pattern details

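  • (?:^|]) - start of the string or ] (if your input is a multiline string, add | RegexOptions.Multiline, but I assume these are all separate strings)
  • (?:(?!\b(?:eng|ita)\b)[^]])* - any 0+ characters other than ], as many as possible, none of which starts the whole word eng or ita (see tempered greedy token to understand this construct better)
  • \b - a word boundary
  • (eng(?:\W+\w+)?\W+sub\W+ita) - Group 1:
    • eng - a literal substring
    • (?:\W+\w+)? - an optional sequence of 1+ non-word characters followed by 1+ word characters (in effect, an optional word)
    • \W+ - 1+ non-word characters
    • sub - a literal substring
    • \W+ - 1+ non-word characters
    • ita - a literal substring
  • \b - a word boundary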
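
The tempered greedy token from the list above can be looked at in isolation. The sketch below anchors just that part at the start of the string and prints what it consumes. The names `TemperedTokenDemo` and `SkippedPrefix` are made up for this illustration, and `RegexOptions.IgnoreCase` is carried over from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

class TemperedTokenDemo
{
    // Only the tempered part of the pattern, anchored at the start of the string.
    static readonly Regex Prefix = new Regex(
        @"^(?:(?!\b(?:eng|ita)\b)[^]])*", RegexOptions.IgnoreCase);

    public static string SkippedPrefix(string input) => Prefix.Match(input).Value;

    static void Main()
    {
        // Stops right before the whole word "Eng".
        Console.WriteLine("'" + SkippedPrefix("H264 - Eng Ac3 - Sub Ita") + "'");     // 'H264 - '
        // Stops right before the whole word "Ita", so "eng" cannot follow directly.
        Console.WriteLine("'" + SkippedPrefix("H264 - Ita Eng Ac3 - Sub Ita") + "'"); // 'H264 - '
        // "English" is not the whole word "eng", so it is consumed.
        Console.WriteLine("'" + SkippedPrefix("English dub") + "'");                  // 'English dub'
        // A ] always ends the run.
        Console.WriteLine("'" + SkippedPrefix("CURA] Fede") + "'");                   // 'CURA'
    }
}
```

In the second input the run ends before `Ita`, so the `\b(eng...` part of the full pattern has nothing to match at that position. This is why the `Ita Eng Ac3 - Sub Ita Eng` strings are rejected.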

See the C# demo

using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

var strs = new List<string> { 
        "Lucifer S03e15 [XviD - Eng Mp3 - Sub Ita Eng] DLRip By Pir8 [CURA] Fede e Religioni",
        "Lucifer S03e15 [XviD - Eng Mp3 - Sub Ita Eng] DLRip By Pir8 [CURA] Fede e Religioni",
        "Lucifer S03e01-08 [XviD - Eng Mp3 - Sub Ita Eng] DLRip By Pir8 [CURA] Fede e Religioni SEASON PREMIERE",
        "Young Sheldon S01e13 [SATRip 720p - H264 - Eng Ac3 - Sub Ita] HDTV by AVS",
        "Young Sheldon S01e08 [Mux 1080p - H264 - Ita Eng Ac3 - Sub Ita Eng] WEBMux Morpheus",
        "Young Sheldon S01e08 [Mux 1080p - H264 - Ita Eng Ac3 - Sub Ita Eng] WEBMux Morpheus",
        "Young Sheldon S01e14 [SATRip 720p - H264 - Eng Ac3 - Sub Ita] HDTV by AVS",
        "Lucifer S03e15 [XviD - Eng Mp3 - Sub Ita Eng] DLRip By Pir8 [CURA] Fede e Religioni",
        "Lucifer S03e16 [XviD - Eng Mp3 - Sub Ita Eng] DLRip By Pir8 [CURA] Fede e Religioni",
        "Lucifer S02e01-13 [XviD - Eng Mp3 - Sub Ita] DLRip by Pir8 [CURA] Fede e Religioni FULL ",
        "Absentia S01e01-10 [Mux 1080p - H264 - Ita Eng Ac3 - Sub Ita Eng] By Morpheus The.Breadwinner.2017.ENG.Sub.ITA.HDRip.XviD-[WEB]"
    };
var rx = new Regex(@"(?:^|])(?:(?!\b(?:eng|ita)\b)[^]])*\b(eng(?:\W+\w+)?\W+sub\W+ita)\b", RegexOptions.Compiled | RegexOptions.IgnoreCase);
foreach (var s in strs)
{
    Console.WriteLine(s);
    var result = rx.Match(s);
    if (result.Success)
        Console.WriteLine("Matched: {0}", result.Groups[1].Value);
    else
        Console.WriteLine("No match!");
    Console.WriteLine("==========================================");
}

Output:

Lucifer S03e15 [XviD - Eng Mp3 - Sub Ita Eng] DLRip By Pir8 [CURA] Fede e Religioni
Matched: Eng Mp3 - Sub Ita
==========================================
Lucifer S03e15 [XviD - Eng Mp3 - Sub Ita Eng] DLRip By Pir8 [CURA] Fede e Religioni
Matched: Eng Mp3 - Sub Ita
==========================================
Lucifer S03e01-08 [XviD - Eng Mp3 - Sub Ita Eng] DLRip By Pir8 [CURA] Fede e Religioni SEASON PREMIERE
Matched: Eng Mp3 - Sub Ita
==========================================
Young Sheldon S01e13 [SATRip 720p - H264 - Eng Ac3 - Sub Ita] HDTV by AVS
Matched: Eng Ac3 - Sub Ita
==========================================
Young Sheldon S01e08 [Mux 1080p - H264 - Ita Eng Ac3 - Sub Ita Eng] WEBMux Morpheus
No match!
==========================================
Young Sheldon S01e08 [Mux 1080p - H264 - Ita Eng Ac3 - Sub Ita Eng] WEBMux Morpheus
No match!
==========================================
Young Sheldon S01e14 [SATRip 720p - H264 - Eng Ac3 - Sub Ita] HDTV by AVS
Matched: Eng Ac3 - Sub Ita
==========================================
Lucifer S03e15 [XviD - Eng Mp3 - Sub Ita Eng] DLRip By Pir8 [CURA] Fede e Religioni
Matched: Eng Mp3 - Sub Ita
==========================================
Lucifer S03e16 [XviD - Eng Mp3 - Sub Ita Eng] DLRip By Pir8 [CURA] Fede e Religioni
Matched: Eng Mp3 - Sub Ita
==========================================
Lucifer S02e01-13 [XviD - Eng Mp3 - Sub Ita] DLRip by Pir8 [CURA] Fede e Religioni FULL 
Matched: Eng Mp3 - Sub Ita
==========================================
Absentia S01e01-10 [Mux 1080p - H264 - Ita Eng Ac3 - Sub Ita Eng] By Morpheus The.Breadwinner.2017.ENG.Sub.ITA.HDRip.XviD-[WEB]
Matched: ENG.Sub.ITA
==========================================
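
The pattern notes mention adding `RegexOptions.Multiline` when the input is one multiline string. Here is a small sketch of the difference, on a made-up two-line input without brackets; the names `MultilineDemo` and `CountMatches` are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

class MultilineDemo
{
    const string Pattern =
        @"(?:^|])(?:(?!\b(?:eng|ita)\b)[^]])*\b(eng(?:\W+\w+)?\W+sub\W+ita)\b";

    public static int CountMatches(string input, RegexOptions options) =>
        new Regex(Pattern, options).Matches(input).Count;

    static void Main()
    {
        // Made-up input: the first line must be rejected, the second accepted.
        string text = "Ita Eng Ac3 - Sub Ita Eng\nEng Mp3 - Sub Ita";

        // Without Multiline, ^ only matches at the very start of the string.
        Console.WriteLine(CountMatches(text, RegexOptions.IgnoreCase)); // 0

        // With Multiline, ^ also matches at the start of the second line.
        Console.WriteLine(CountMatches(text, RegexOptions.IgnoreCase | RegexOptions.Multiline)); // 1
    }
}
```

Note that `[^]]` also matches a newline, so with multiline input the skipped prefix may span several lines. Group 1 is unaffected by that.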

Answer 1: (score: 0)

Here is a relatively simple regex for this problem:

(?:(?<=[-]\s)(?:ITA\s)?\w{3}\s\w{3}\s[-]\s\w{3}\s\w{3}\s\w{3}\b)|(?:Eng\.sub\.ita)

You can test it out here

REGEX:

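(?<=[-]\s) is a positive lookbehind that makes sure the match is preceded by a dash and a whitespace character (without including them in the match).

(?:ITA\s)? is an optional non-capturing group: if the match starts with "ITA" and a whitespace character, they are included in the match as well.

\w{3} matches a string of exactly three word characters (letters, digits, underscores, or any mix of them).

\s matches a single whitespace character.

[-] is just a fancy way of matching a single -.

|(?:Eng\.sub\.ita) tells the regex to also match eng.sub.ita (case-insensitively, as long as the ignore-case option is enabled), in addition to the original match (if both appear in the same sentence).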
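
As a rough check of the pieces described above, the sketch below runs the pattern against two of the sample titles from this thread. `RegexOptions.IgnoreCase` is an assumption of this sketch (the dotted title is upper-case, so it would not match without it), and the names `SecondAnswerDemo` and `FirstMatch` are made up.

```csharp
using System;
using System.Text.RegularExpressions;

class SecondAnswerDemo
{
    // Pattern copied from the answer; IgnoreCase is an assumption of this sketch.
    static readonly Regex Rx = new Regex(
        @"(?:(?<=[-]\s)(?:ITA\s)?\w{3}\s\w{3}\s[-]\s\w{3}\s\w{3}\s\w{3}\b)|(?:Eng\.sub\.ita)",
        RegexOptions.IgnoreCase);

    public static string FirstMatch(string input)
    {
        Match m = Rx.Match(input);
        return m.Success ? m.Value : "(no match)";
    }

    static void Main()
    {
        // The lookbehind requires "- " before the match but leaves it out of the result.
        Console.WriteLine(FirstMatch("Lucifer S03e15 [XviD - Eng Mp3 - Sub Ita Eng] DLRip By Pir8"));
        // prints: Eng Mp3 - Sub Ita Eng

        // No "- " here, so only the second alternative can match.
        Console.WriteLine(FirstMatch("The.Breadwinner.2017.ENG.Sub.ITA.HDRip.XviD-[WEB]"));
        // prints: ENG.Sub.ITA
    }
}
```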

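Please note:

If a show's name contains something like - red SEO - two one, that is, 'dash-space-three_letters-space-three_letters-space-dash-space-three_letters-space-three_letters' (followed by one more three-character word, which the pattern also requires), then the show's name itself will match as well.

However, the chance of a show's name containing such a format is negligible, so you don't need to worry about it.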