Is it possible to describe block comments using EBNF?

Time: 2019-08-27 19:47:56

Tags: parsing comments context-free-grammar ebnf

Say I have the following EBNF:

document    = content , { content } ;
content     = hello world | answer | space ;
hello world = "hello" , space , "world" ;
answer      = "42" ;
space       = " " ;

This lets me parse the following:

hello world 42

Now I want to extend this grammar with block comments. How do I do this properly?

If I start simple:

document    = content , { content } ;
content     = hello world | answer | space | comment;
hello world = "hello" , space , "world" ;
answer      = "42" ;
space       = " " ;
comment     = "/*" , ?any character? , "*/" ;

I cannot parse:

Hello /* I'm the taxman! */ World 42

If I extend the grammar above with special cases, it gets ugly, but that input parses.

document    = content , { content } ;
content     = hello world | answer | space | comment;
hello world = "hello" , { comment } , space , { comment } , "world" ;
answer      = "42" ;
space       = " " ;
comment     = "/*" , ?any character? , "*/" ;

But I still cannot parse something like this:

Hel/*p! I need somebody. Help! Not just anybody... */lo World 42

How can I do this with an EBNF grammar? Or is it not possible at all?

2 Answers:

Answer 0: (score: 2)

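Assuming you treat "hello" as a token, you would not want anything to break it up. If you do need to allow that, you have to explode the rule into individual characters: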

hello_world = "h", {comment}, "e", {comment}, "l", {comment}, "l", {comment}, "o" ,
              { comment }, space, { comment },
              "w", {comment}, "o", {comment}, "r", {comment}, "l", {comment}, "d" ;
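
The exploded rule above is mechanical enough to generate rather than write by hand. A minimal sketch in Python, using a regular expression in place of the EBNF rule; the names `COMMENT`, `exploded` and `HELLO_WORLD` are made up for this illustration, and the comment pattern assumes that a comment body runs up to the first "*/":

```python
import re

# Block comment: "/*", any characters that do not start "*/", then "*/".
COMMENT = r"/\*(?:(?!\*/).)*\*/"
GAP = f"(?:{COMMENT})*"  # corresponds to { comment }


def exploded(word: str) -> str:
    """Return a regex for `word` that allows comments between any two letters."""
    return GAP.join(re.escape(ch) for ch in word)


# hello_world = "h", {comment}, ..., "o", {comment}, space, {comment}, "w", ..., "d"
HELLO_WORLD = re.compile(
    exploded("hello") + GAP + " " + GAP + exploded("world"),
    re.DOTALL,  # let comments span several lines
)

print(bool(HELLO_WORLD.fullmatch("hel/*p! I need somebody. */lo world")))  # True
print(bool(HELLO_WORLD.fullmatch("helloworld")))                           # False
```

This only shows that the exploded rule is expressible; whether splitting a token with a comment is desirable is a separate question.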

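As for the broader question, it seems to be common not to describe a language's comments as part of the formal grammar at all, but to mention them as a side note. It can generally be done, however, by making a comment equivalent to whitespace: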

space = " " | comment ;

You may also want to add a rule that describes a run of consecutive whitespace:

spaces = { space }- ;

Cleaning up the final grammar, but treating "hello" and "world" as tokens (i.e. not allowing them to be broken up), might give something like this:

document    = { content }- ;
content     = hello world | answer | space ;
hello world = "hello" , spaces , "world" ;
answer      = "42" ;
spaces      = { space }- ;
space       = " " | comment ;
comment     = "/*" , ?any character? , "*/" ;
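
In a hand-written lexer, "a comment is a kind of space" usually means the comment is matched by the same token class as a blank and then dropped. A rough Python sketch of that idea follows; `TOKEN` and `tokenize` are hypothetical names, and the comment pattern assumes a body of arbitrary characters up to the first "*/", which the `comment` rule above does not spell out:

```python
import re

# One alternative per terminal; a comment is lexed as a kind of space.
TOKEN = re.compile(
    r"""
    (?P<space>  [ ] | /\*(?:(?!\*/).)*\*/ )   # space = " " | comment
  | (?P<hello>  hello )
  | (?P<world>  world )
  | (?P<answer> 42 )
    """,
    re.VERBOSE | re.DOTALL,
)


def tokenize(text: str) -> list[str]:
    """Return the token kinds in `text`, dropping spaces and comments."""
    kinds = []
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at offset {pos}: {text[pos]!r}")
        if match.lastgroup != "space":
            kinds.append(match.lastgroup)
        pos = match.end()
    return kinds


print(tokenize("hello /* I'm the taxman! */ world 42"))
# ['hello', 'world', 'answer']
```

Because "hello" is a single token here, `tokenize("hel/*p*/lo")` raises `ValueError`, which matches the decision not to let comments split a token. The sketch stops at the lexer level: it discards spaces, so it does not by itself enforce the `spaces` required between "hello" and "world".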

Answer 1: (score: 1)

  

"How can I do this with an EBNF grammar? Or is it not possible at all?"

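Some languages remove comments in a preprocessor; others replace each comment with a space. Removing the comments is probably the simplest solution to this problem. Note, however, that this approach also removes comments that appear inside literals, which is not normally done.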

document = preprocess, process;

preprocess = {(? any character ? - comment, ? append char to text ?)},
    ? text for input to process ?;

comment = "/*", {? any character ? - "*/"}, "*/", ? discard ?;

process = {content}-;

content = hello world | answer | spaces;

hello world = ("H" | "h"), "ello", spaces, ("W" | "w") , "orld";

answer = "42";

spaces = {" "}-;

Given to the preprocessor,

Hello /* I'm the taxman! */ World 42

produces

Hello  World 42

Note the two spaces.

And

Hel/*p! I need somebody. Help! Not just anybody... */lo World 42

produces

Hello World 42
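
The two stages can be tried out with a few lines of Python. This is only a sketch of the preprocess/process split described above, with regular expressions standing in for the EBNF rules; `COMMENT`, `PROCESS`, `preprocess` and `accepts` are names invented for the example:

```python
import re

# comment = "/*", {? any character ? - "*/"}, "*/"
COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)

# process = {content}- with content = hello world | answer | spaces
PROCESS = re.compile(r"(?:[Hh]ello +[Ww]orld|42| +)+")


def preprocess(text: str) -> str:
    """Delete every block comment; an unterminated "/*" is left untouched."""
    return COMMENT.sub("", text)


def accepts(text: str) -> bool:
    """Strip comments first, then check the result against the grammar."""
    return PROCESS.fullmatch(preprocess(text)) is not None


print(repr(preprocess("Hello /* I'm the taxman! */ World 42")))
# 'Hello  World 42'
print(repr(preprocess("Hel/*p! I need somebody. Help! Not just anybody... */lo World 42")))
# 'Hello World 42'
print(accepts("Hel/*p! I need somebody. Help! Not just anybody... */lo World 42"))
# True
```

As with the grammar, this deletes anything that looks like a comment, including text inside string literals if the language had any.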