EBNF用于捕获两个可选值之间的逗号

时间:2019-05-07 17:47:21

标签: regex big-o ebnf

我有两个可选值,并且当两个都存在时,它们之间必须使用逗号。如果存在一个或两个值,则可能存在 ,但如果不存在任何值,则不允许使用逗号。

有效示例:

(first,second,)
(first,second)
(first,)
(first)
(second,)
(second)
()

无效示例:

(first,first,)
(first,first)
(second,second,)
(second,second)
(second,first,)
(second,first)
(,first,second,)
(,first,second)
(,first,)
(,first)
(,second,)
(,second)
(,)
(,first,first,)
(,first,first)
(,second,second,)
(,second,second)
(,second,first,)
(,second,first)

我有足够的EBNF代码(XML-flavored),但是有什么方法可以简化它?我想使其更具可读性/减少重复。

tuple ::= "(" ( ( "first" | "second" | "first" "," "second" ) ","? )? ")"

如果在正则表达式中更容易理解,请使用等效代码,但我需要EBNF解决方案。

/\(((first|second|first\,second)\,?)?\)/

这是一个有用的铁路图: "(" ( ( "first" | "second" | "first" "," "second" ) ","? )? ")"

当我们将其抽象为三个术语时,这个问题变得更加复杂:"first""second""third"都是可选的,但必须出现按此顺序,以逗号分隔,并带有可选的尾随逗号。我能想到的最好的方法是蛮力方法:

"(" (("first" | "second" | "third" | "first" "," "second" | "first" "," "third" | "second" "," "third" | "first" "," "second" "," "third") ","?)? ")"

"(" (("first" | "second" | "third" | "first" "," "second" | "first" "," "third" | "second" "," "third" | "first" "," "second" "," "third") ","?)? ")" 显然,涉及 O(2 n 复杂度的解决方案不是很理想。

3 个答案:

答案 0 :(得分:0)

此表达式可能会帮助您设计更好的表达式。您可以仅使用捕获组并从左向右滑动并传递可能的输入来完成此操作,也许与此类似:

\((first|second|)(,|)(second|)([\)|,]+)

我只是猜测您希望捕获中间逗号:

enter image description here

这可能不是您想要的确切表达。但是,它可能向您显示了如何以一种简单的方式完成此操作:

^(?!\(,)\((first|)(,|)(second|)([\)|,]+)$

enter image description here

您可以在表达式的左侧和右侧添加更多边界,可能类似于this expression

enter image description here

此图显示了第二个表达式的工作方式:

enter image description here

性能

此JavaScript代码段使用简单的100万次for循环来显示第二个表达式的性能,以及如何使用first和{捕获second$1 {1}}。

$3

答案 1 :(得分:0)

我不熟悉EBNF,但我熟悉BNF和解析器语法。以下只是基于我自己的正则表达式的内容的变体。我假设未加引号的括号不被视为标记,而是用于对相关元素进行分组。

  tuple ::= ( "(" ( "first,second" | "first" | "second" ) ","? ")" ) | "()"
  • 它在(first,second(first(second上匹配
  • 然后在可选的,上匹配
  • 紧接着是括号。 )
  • 或空的parens分组。 ()

但是我怀疑这是一个进步。

这是我的Java测试代码。测试数据中的字符串的前两行匹配。其他人没有。

      String[] testdata = {
            "(first,second,)", "(first,second)", "(first,)", "(first)",
            "(second,)", "(second)", "()",

            "(first,first,)", "(first,first)", "(second,second,)",
            "(second,second)", "(second,first,)", "(second,first)",
            "(,first,second,)", "(,first,second)", "(,first,)", "(,first)",
            "(,second,)", "(,second)", "(,)", "(,first,first,)",
            "(,first,first)", "(,second,second,)", "(,second,second)",
            "(,second,first,)", "(,second,first)"
      };

      String reg = "\\(((first,second)|first|second),?\\)|\\(\\)";
      Pattern p = Pattern.compile(reg);

      for (String t : testdata) {
         Matcher m = p.matcher(t);
         if (m.matches()) {
            System.out.println(t);
         }
      }

答案 2 :(得分:0)

我找到了一种简化它的方法,但是并没有太多:

"(" ( ("first" ("," "second")? | "second") ","? )? ")"

"(" ( ("first" ("," "second")? | "second") ","? )? ")"

对于三个学期的解决方案,请采用两个学期的解决方案,并在第一学期之前添加

"(" (("first" ("," ("second" ("," "third")? | "third"))? | "second" ("," "third")? | "third") ","?)? ")"

"(" (("first" ("," ("second" ("," "third")? | "third"))? | "second" ("," "third")? | "third") ","?)? ")"

对于任何(n + 1)项解决方案,采用n项解决方案并在第一项之前添加。这种复杂度是 O(n),它明显优于 O(2 n