I am using the standardized version of EBNF (ISO/IEC 14977:1996(E)) to define my grammar. The standardized version is a meta-metalanguage (it can define itself).
I have defined letter as:
letter = 'A' | 'B' | 'C' | 'D' | 'E' | 'H' | 'I' | 'J' | 'K' | 'L' |
'O' | 'P' | 'Q' | 'R' | 'S' | 'V' | 'W' | 'X' | 'Y' | 'Z' | 'a' | 'b'
| 'c' | 'd' | 'e' | 'h' | 'i' | 'j' | 'k' | 'l' | 'o' | 'p' | 'q' |
'r' | 's' | 'v' | 'w' | 'x' | 'y' | 'z' | 'F' | 'G' | 'M' | 'N' | 'T' |
'U' | 'f' | 'g' | 'm' | 'n' | 't' | 'u';
I would prefer to write letter = [a..z]|[A..Z];
My question is: would defining letter in this form (using a regular expression) destroy EBNF's self-defining property?
Answer 0: (score: 1)
Use a special sequence for this:
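A special-sequence consists of a special-sequence-symbol followed by a (possibly empty) sequence of special-sequence-characters followed by a special-sequence-symbol.
The sequence of symbols represented by a special-sequence is outside the scope of this International Standard. Only the format of a special-sequence is defined in this International Standard. A special-sequence provides a notation for extensions a user may require.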
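Applied to the rule in the question, this could be sketched as follows. The standard does not interpret the text between the question marks, so the regex-style character class used here is only an assumption about what the tool reading the grammar would accept:

```ebnf
(* Special sequence: everything between the two ? symbols is opaque
   to ISO/IEC 14977. It is assumed here that the tool consuming this
   grammar reads it as a character class. *)
letter = ? [A-Za-z] ? ;
```

Because the `? ... ?` form is itself part of the standard's syntax, a grammar containing this rule is still syntactically valid standard EBNF, and the notation can still define itself. Only the meaning of the text inside the special sequence is left to the tool.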
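The W3C makes extensive use of such extensions in its own EBNF notation. For example, from the XML specification: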
The formal grammar of XML is given in this specification using a simple Extended Backus-Naur Form (EBNF) notation. Each rule in the grammar defines one symbol, in the form

    symbol ::= expression

Symbols are written with an initial capital letter if they are the start symbol of a regular language, otherwise with an initial lowercase letter. Literal strings are quoted.

Within the expression on the right-hand side of a rule, the following expressions are used to match strings of one or more characters:

#xN
    where N is a hexadecimal integer, the expression matches the character whose number (code point) in ISO/IEC 10646 is N. The number of leading zeros in the #xN form is insignificant.
[a-zA-Z], [#xN-#xN]
    matches any Char with a value in the range(s) indicated (inclusive).
[abc], [#xN#xN#xN]
    matches any Char with a value among the characters enumerated. Enumerations and ranges can be mixed in one set of brackets.
[^a-z], [^#xN-#xN]
    matches any Char with a value outside the range indicated.
[^abc], [^#xN#xN#xN]
    matches any Char with a value not among the characters given. Enumerations and ranges of forbidden values can be mixed in one set of brackets.
"string"
    matches a literal string matching that given inside the double quotes.
'string'
    matches a literal string matching that given inside the single quotes.

These symbols may be combined to match more complex patterns as follows, where A and B represent simple expressions:

(expression)
    expression is treated as a unit and may be combined as described in this list.
A?
    matches A or nothing; optional A.
A B
    matches A followed by B. This operator has higher precedence than alternation; thus A B | C D is identical to (A B) | (C D).
A | B
    matches A or B.
A - B
    matches any string that matches A but does not match B.
A+
    matches one or more occurrences of A. Concatenation has higher precedence than alternation; thus A+ | B+ is identical to (A+) | (B+).
A*
    matches zero or more occurrences of A. Concatenation has higher precedence than alternation; thus A* | B* is identical to (A*) | (B*).

Other notations used in the productions are:

/* ... */
    comment.
[ wfc: ... ]
    well-formedness constraint; this identifies by name a constraint on well-formed documents associated with a production.
[ vc: ... ]
    validity constraint; this identifies by name a constraint on valid documents associated with a production.