Why doesn't my regex pattern match the text?

Asked: 2019-11-28 06:53:48

Tags: c# .net regex cron

I have a regular expression pattern that I want to match against my cron expression:

string pattern = @"(((([0-9]|[0-5][0-9])(-([0-9]|[0-5][0-9]))?,)*([0-9]|[0-5][0-9])(-([0-9]|[0-5][0-9]))?)|(([\\*]|[0-9]|[0-5][0-9])/([0-9]|[0-5][0-9]))|([\\?])|([\\*]))[\\s](((([0-9]|[0-5][0-9])(-([0-9]|[0-5][0-9]))?,)*([0-9]|[0-5][0-9])(-([0-9]|[0-5][0-9]))?)|(([\\*]|[0-9]|[0-5][0-9])/([0-9]|[0-5][0-9]))|([\\?])|([\\*]))[\\s](((([0-9]|[0-1][0-9]|[2][0-3])(-([0-9]|[0-1][0-9]|[2][0-3]))?,)*([0-9]|[0-1][0-9]|[2][0-3])(-([0-9]|[0-1][0-9]|[2][0-3]))?)|(([\\*]|[0-9]|[0-1][0-9]|[2][0-3])/([0-9]|[0-1][0-9]|[2][0-3]))|([\\?])|([\\*]))[\\s](((([1-9]|[0][1-9]|[1-2][0-9]|[3][0-1])(-([1-9]|[0][1-9]|[1-2][0-9]|[3][0-1]))?,)*([1-9]|[0][1-9]|[1-2][0-9]|[3][0-1])(-([1-9]|[0][1-9]|[1-2][0-9]|[3][0-1]))?(C)?)|(([1-9]|[0][1-9]|[1-2][0-9]|[3][0-1])/([1-9]|[0][1-9]|[1-2][0-9]|[3][0-1])(C)?)|(L(-[0-9])?)|(L(-[1-2][0-9])?)|(L(-[3][0-1])?)|(LW)|([1-9]W)|([1-3][0-9]W)|([\\?])|([\\*]))[\\s](((([1-9]|0[1-9]|1[0-2])(-([1-9]|0[1-9]|1[0-2]))?,)*([1-9]|0[1-9]|1[0-2])(-([1-9]|0[1-9]|1[0-2]))?)|(([1-9]|0[1-9]|1[0-2])/([1-9]|0[1-9]|1[0-2]))|(((JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)(-(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC))?,)*(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)(-(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC))?)|((JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)/(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC))|([\\?])|([\\*]))[\\s]((([1-7](-([1-7]))?,)*([1-7])(-([1-7]))?)|([1-7]/([1-7]))|(((MON|TUE|WED|THU|FRI|SAT|SUN)(-(MON|TUE|WED|THU|FRI|SAT|SUN))?,)*(MON|TUE|WED|THU|FRI|SAT|SUN)(-(MON|TUE|WED|THU|FRI|SAT|SUN))?(C)?)|((MON|TUE|WED|THU|FRI|SAT|SUN)/(MON|TUE|WED|THU|FRI|SAT|SUN)(C)?)|(([1-7]|(MON|TUE|WED|THU|FRI|SAT|SUN))?(L|LW)?)|(([1-7]|MON|TUE|WED|THU|FRI|SAT|SUN)#([1-7])?)|([\\?])|([\\*]))([\\s]?(([\\*])?|(19[7-9][0-9])|(20[0-9][0-9]))?| (((19[7-9][0-9])|(20[0-9][0-9]))/((19[7-9][0-9])|(20[0-9][0-9])))?| ((((19[7-9][0-9])|(20[0-9][0-9]))(-((19[7-9][0-9])|(20[0-9][0-9])))?,)*((19[7-9][0-9])|(20[0-9][0-9]))(-((19[7-9][0-9])|(20[0-9][0-9])))?)?)";
string text = "0 0 0 ? APR,MAY * 2020,2021,2022";
Match match = Regex.Match(text, pattern);

// match.Value is: "0 0 0 ? APR,MAY *"

The pattern does not match the years, and I cannot figure out why.

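Edit: Sorry about the javascript tag on the earlier version of this post; let me explain why I added it. The project is C# on .NET, but I was also thinking about validating the expression on the front end, which is why I added the javascript tag. Sorry, that was probably a bad idea.
On topic: I have a cron expression, like the one in the "string text" variable above, and I want to check whether a given cron expression is valid. A cron expression can be thought of as having 7 fields,
 "Second Minute Hour DayOfMonth Month DayOfWeek Year", with a single space between fields. The pattern handles every field except the year.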

1 answer:

Answer 0 (score: 0)

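This expression was written or generated for a POSIX regex engine. A POSIX engine picks the longest match among the alternatives.

C# is a .NET language, and .NET uses a Perl-style backtracking engine (in this respect it behaves like PCRE). It tries the alternatives from left to right and keeps the first one that succeeds.

For example, take the pattern [0-9]|[0-5][0-9] and the input "59". A POSIX engine matches "59", the longest candidate. A backtracking engine tries [0-9] first, matches "5", and stops there; it only goes back to try [0-5][0-9] if something later in the pattern fails.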
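The effect of alternative order is easy to check in isolation. The following is a minimal sketch, not part of the original pattern; the variable names are made up for illustration, and it assumes a project that allows top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

// The same two alternatives in two different orders.
string shortFirst = "[0-9]|[0-5][0-9]"; // order used in the question
string longFirst = "[0-5][0-9]|[0-9]";  // longer alternative listed first

// .NET takes the first alternative that succeeds at the current position.
string a = Regex.Match("59", shortFirst).Value;
string b = Regex.Match("59", longFirst).Value;

Console.WriteLine(a); // 5
Console.WriteLine(b); // 59
```

Listing the longer alternative first is what the rewritten pattern does for every numeric field.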

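In your pattern, the order of the alternatives is not optimal for a backtracking engine like .NET's, and in some places it is wrong. Here is a rewritten pattern, laid out with whitespace and # comments for readability: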

( #1
    (
        ([0-5][0-9]|[0-9])
        (-([0-5][0-9]|[0-9]))?
        ,
    )*
    ([0-5][0-9]|[0-9])
    (-([0-5][0-9]|[0-9]))?
  | ([*]|[0-5][0-9]|[0-9])
    \/
    ([0-5][0-9]|[0-9])
  | [?]
  | [*]
)
[\s]
( #2
    (
        ([0-5][0-9]|[0-9])
        (-([0-5][0-9]|[0-9]))?
        ,
    )*
    ([0-5][0-9]|[0-9])
    (-([0-5][0-9]|[0-9]))?
  | ([*]|[0-5][0-9]|[0-9])
    \/
    ([0-5][0-9]|[0-9])
  | [?]
  | [*]
)
[\s]
( #3
    (
        ([0-1][0-9]|[2][0-3]|[0-9])
        (-([0-1][0-9]|[2][0-3]|[0-9]))?
        ,
    )*
    ([0-1][0-9]|[2][0-3]|[0-9])
    (-([0-1][0-9]|[2][0-3]|[0-9]))?
  | ([*]|[0-1][0-9]|[2][0-3]|[0-9])
    \/
    ([0-1][0-9]|[2][0-3]|[0-9])
  | [?]
  | [*]
)
[\s]
( #4
    (
        ([0][1-9]|[1-2][0-9]|[3][0-1]|[1-9])
        (-([0][1-9]|[1-2][0-9]|[3][0-1]|[1-9]))?
        ,
    )*
    ([0][1-9]|[1-2][0-9]|[3][0-1]|[1-9])
    (-([0][1-9]|[1-2][0-9]|[3][0-1]|[1-9]))?
    C?
  | ([0][1-9]|[1-2][0-9]|[3][0-1]|[1-9])
    \/
    ([0][1-9]|[1-2][0-9]|[3][0-1]|[1-9])
    C?
  | LW
  | L(-([1-2][0-9]|[3][0-1]|[0-9]))?
  | [1-3][0-9]W
  | [1-9]W
  | [?]
  | [*]
)
[\s]
( #5
    (
        (0[1-9]|1[0-2]|[1-9])
        (-(0[1-9]|1[0-2]|[1-9]))?
        ,
    )*
    (0[1-9]|1[0-2]|[1-9])
    (-(0[1-9]|1[0-2]|[1-9]))?
  | (0[1-9]|1[0-2]|[1-9])
    \/
    (0[1-9]|1[0-2]|[1-9])
  | (
        (JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)
        (-(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC))?
        ,
    )*
    (JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)
    (-(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC))?
  | (JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)
    \/
    (JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)
  | [?]
  | [*]
)
[\s]
( #6
    (
        [1-7]
        (-[1-7])?
        ,
    )*
    [1-7]
    (-[1-7])?
  | [1-7]\/[1-7]
  | (
        (MON|TUE|WED|THU|FRI|SAT|SUN)
        (-(MON|TUE|WED|THU|FRI|SAT|SUN))?
        ,
    )*
    (MON|TUE|WED|THU|FRI|SAT|SUN)
    (-(MON|TUE|WED|THU|FRI|SAT|SUN))?
    C?
  | (MON|TUE|WED|THU|FRI|SAT|SUN)
    \/
    (MON|TUE|WED|THU|FRI|SAT|SUN)
    C?
  | ([1-7]|MON|TUE|WED|THU|FRI|SAT|SUN)
    [#]
    [1-7]
  | ([1-7]|MON|TUE|WED|THU|FRI|SAT|SUN)
    (LW|L)?
  | [?]
  | [*]
)
( #7
    [\s]
    (
        (19[7-9][0-9]|20[0-9][0-9])
        (-(19[7-9][0-9]|20[0-9][0-9]))?
        ,
    )*
    (19[7-9][0-9]|20[0-9][0-9])
    (-(19[7-9][0-9]|20[0-9][0-9]))?
  | [\s]
    (19[7-9][0-9]|20[0-9][0-9])
    \/
    (19[7-9][0-9]|20[0-9][0-9])
  | [\s]
    [*]
)?
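A layout like this should be usable directly in .NET when RegexOptions.IgnorePatternWhitespace is set, since that option ignores unescaped whitespace and treats # to end of line as a comment. The sketch below is only an illustration: it uses just the year group (#7), the surrounding ^ and $ are added here so the group can be tested on its own, and the variable names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Only the optional year field (#7), anchored so it can be tested alone.
string yearField = @"
^
( #7
    [\s]
    (
        (19[7-9][0-9]|20[0-9][0-9])
        (-(19[7-9][0-9]|20[0-9][0-9]))?
        ,
    )*
    (19[7-9][0-9]|20[0-9][0-9])
    (-(19[7-9][0-9]|20[0-9][0-9]))?
  | [\s]
    (19[7-9][0-9]|20[0-9][0-9])
    \/
    (19[7-9][0-9]|20[0-9][0-9])
  | [\s]
    [*]
)?
$";

// IgnorePatternWhitespace drops the layout whitespace and the '#7' comment.
var yearRegex = new Regex(yearField, RegexOptions.IgnorePatternWhitespace);

Console.WriteLine(yearRegex.IsMatch(" 2020,2021,2022")); // True
Console.WriteLine(yearRegex.IsMatch(" 2020/2021"));      // True
Console.WriteLine(yearRegex.IsMatch(" 1969"));           // False
```

The leading space in each input stands for the separator that follows the day-of-week field in a full expression.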

Or, compressed into a single string:

var pattern = @"((([0-5][0-9]|[0-9])(-([0-5][0-9]|[0-9]))?,)*([0-5][0-9]|[0-9])(-([0-5][0-9]|[0-9]))?|([*]|[0-5][0-9]|[0-9])\/([0-5][0-9]|[0-9])|[?]|[*])[\s]((([0-5][0-9]|[0-9])(-([0-5][0-9]|[0-9]))?,)*([0-5][0-9]|[0-9])(-([0-5][0-9]|[0-9]))?|([*]|[0-5][0-9]|[0-9])\/([0-5][0-9]|[0-9])|[?]|[*])[\s]((([0-1][0-9]|[2][0-3]|[0-9])(-([0-1][0-9]|[2][0-3]|[0-9]))?,)*([0-1][0-9]|[2][0-3]|[0-9])(-([0-1][0-9]|[2][0-3]|[0-9]))?|([*]|[0-1][0-9]|[2][0-3]|[0-9])\/([0-1][0-9]|[2][0-3]|[0-9])|[?]|[*])[\s]((([0][1-9]|[1-2][0-9]|[3][0-1]|[1-9])(-([0][1-9]|[1-2][0-9]|[3][0-1]|[1-9]))?,)*([0][1-9]|[1-2][0-9]|[3][0-1]|[1-9])(-([0][1-9]|[1-2][0-9]|[3][0-1]|[1-9]))?C?|([0][1-9]|[1-2][0-9]|[3][0-1]|[1-9])\/([0][1-9]|[1-2][0-9]|[3][0-1]|[1-9])C?|LW|L(-([1-2][0-9]|[3][0-1]|[0-9]))?|[1-3][0-9]W|[1-9]W|[?]|[*])[\s](((0[1-9]|1[0-2]|[1-9])(-(0[1-9]|1[0-2]|[1-9]))?,)*(0[1-9]|1[0-2]|[1-9])(-(0[1-9]|1[0-2]|[1-9]))?|(0[1-9]|1[0-2]|[1-9])\/(0[1-9]|1[0-2]|[1-9])|((JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)(-(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC))?,)*(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)(-(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC))?|(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)\/(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)|[?]|[*])[\s](([1-7](-[1-7])?,)*[1-7](-[1-7])?|[1-7]\/[1-7]|((MON|TUE|WED|THU|FRI|SAT|SUN)(-(MON|TUE|WED|THU|FRI|SAT|SUN))?,)*(MON|TUE|WED|THU|FRI|SAT|SUN)(-(MON|TUE|WED|THU|FRI|SAT|SUN))?C?|(MON|TUE|WED|THU|FRI|SAT|SUN)\/(MON|TUE|WED|THU|FRI|SAT|SUN)C?|([1-7]|MON|TUE|WED|THU|FRI|SAT|SUN)[#][1-7]|([1-7]|MON|TUE|WED|THU|FRI|SAT|SUN)(LW|L)?|[?]|[*])([\s]((19[7-9][0-9]|20[0-9][0-9])(-(19[7-9][0-9]|20[0-9][0-9]))?,)*(19[7-9][0-9]|20[0-9][0-9])(-(19[7-9][0-9]|20[0-9][0-9]))?|[\s](19[7-9][0-9]|20[0-9][0-9])\/(19[7-9][0-9]|20[0-9][0-9])|[\s][*])?";

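You may also want to add ^ at the start of the pattern and $ at the end. This ensures that there are no extra characters before or after the matched cron expression.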

var pattern = @"^((([0-5] ... [\s][*])?$";
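To see what the anchors change, here is a small sketch that uses a single seconds/minutes value as a hypothetical stand-in for the full pattern; the names are illustrative only.

```csharp
using System;
using System.Text.RegularExpressions;

// Stand-in for the full pattern: one value in the range 0-59.
string field = "([0-5][0-9]|[0-9])";
string anchored = "^" + field + "$";

// Without anchors, any input that contains a valid value is accepted.
bool looseInvalid = Regex.IsMatch("99", field);         // matches the "9"
// With anchors, the whole input has to be a valid value.
bool strictInvalid = Regex.IsMatch("99", anchored);
bool strictValid = Regex.IsMatch("59", anchored);
bool strictTrailing = Regex.IsMatch("59 junk", anchored);

Console.WriteLine(looseInvalid);   // True
Console.WriteLine(strictInvalid);  // False
Console.WriteLine(strictValid);    // True
Console.WriteLine(strictTrailing); // False
```

For validation, Regex.IsMatch with an anchored pattern is usually enough; Regex.Match is only needed when the captured groups matter.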