Question

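I have been trying to use ANTLR4 to build a parser for a small template system I need. As the sample below shows, a template contains functions that always open with '{{' and close with a matching '}}'. Each function is defined so that it can be set up and executed, and the whole function should then be replaced with its result.

The problem is that I want to leave all text outside the functions alone and parse only what I have defined. Right now STRING neither matches the other text nor leaves it alone. I want to skip or ignore all text except the defined functions.

The end goal is to replace each entire {{..}} section with the result of the function.

Is there a way to skip all the other text?

Here is the sample text: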

allkinds of text $#@ {{getMetaSelf option1}} blabla bla {{getEnv test test}} now repeat something: {{repeatPerInstance test ','}} 
this will get repeated. sub functions are possible here now: {{getMetaInstance option1}} blabla {{endrepeat}} more text.

This is what I have so far:

parse: EOF | (functions | STRING)* ;

functions : '{{' func STRING*
          ;

func : getterFunctions
     | 'repeatPerInstance ' KEYWORD ( ' ' delimiter )? '}}' ( '{{' ( getterFunctions | repeatSubFunctions ) )*  '{{endrepeat}}'
     ;

getterFunctions : 'getEnv ' KEYWORD ' ' KEYWORD '}}'
                | 'getMetaSelf ' metaoptions '}}'
                ;

repeatSubFunctions : 'getEnvRole' KEYWORD '}}'
                   | 'getMetaInstance' metaoptions '}}'
                   ;


metaoptions : 'option1'
            | 'option2'
            | 'option3'
            | 'option4'
            ;

delimiter : '\'' ',' '\'' ;

STRING : . +? ;

KEYWORD : [0-9A-Za-z\-\_]+ ;
WS  : [ \t\n\r]+ -> skip ;

Answer 1

You can use a multi-mode lexer. Since the file starts with free-form text, the default mode should be the one that contains the TEXT rule.

lexer grammar MyLexer;

TEXT
  : ( ~'{'
    | '{' {_input.LA(1) != '{'}?
    )+
  ;

OPEN_TAG
  : '{{' -> pushMode(InTag)
  ;

mode InTag;

  END_TAG
    : '}}' -> popMode
    ;

  // other rules for tokens inside of {{ ... }} go here
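
To make the mode switching concrete, here is a minimal hand-written sketch in Python of what the two modes do. It is not ANTLR-generated code and only approximates the grammar above: the function name `tokenize` and the catch-all `WORD` token are hypothetical stand-ins for whatever rules you put inside `InTag`.

```python
def tokenize(source):
    """Split template text into (type, value) pairs using two modes.

    Illustration only: WORD is a hypothetical catch-all for tokens in a tag.
    """
    tokens = []
    mode = "DEFAULT"  # mirrors the lexer's default mode; "IN_TAG" mirrors InTag
    i, n = 0, len(source)
    while i < n:
        if mode == "DEFAULT":
            if source.startswith("{{", i):
                tokens.append(("OPEN_TAG", "{{"))
                mode = "IN_TAG"  # plays the role of pushMode(InTag)
                i += 2
            else:
                # TEXT: everything up to the next "{{"; a lone "{" stays text
                end = source.find("{{", i)
                if end == -1:
                    end = n
                tokens.append(("TEXT", source[i:end]))
                i = end
        else:
            if source.startswith("}}", i):
                tokens.append(("END_TAG", "}}"))
                mode = "DEFAULT"  # plays the role of popMode
                i += 2
            elif source[i].isspace():
                i += 1  # whitespace is dropped only while inside a tag
            else:
                # WORD: run of characters up to whitespace or the closing "}}"
                end = i
                while (end < n and not source[end].isspace()
                       and not source.startswith("}}", end)):
                    end += 1
                tokens.append(("WORD", source[i:end]))
                i = end
    return tokens
```

Calling `tokenize("a { b {{getEnv x y}} c")` keeps `a { b ` and ` c` as single TEXT tokens, while the tag is broken into OPEN_TAG, three WORD tokens, and END_TAG. The parser then only has to accept TEXT between tags instead of trying to describe arbitrary text itself.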

Answer 2

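As a side note:

The following line looks odd, because it accepts STRING tokens after the function, which is probably not intended:

functions : '{{' func STRING* ;

I would also model the functions this way (untested), keeping the {{ }} delimiters grouped in one rule: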

parse : (function | STRING)* EOF ;

function : '{{' functionBody '}}' ;

functionBody : getterFunctions
             | 'repeatPerInstance ' KEYWORD ( ' ' delimiter )? '}}' ( '{{' ( getterFunctions | repeatSubFunctions ) '}}' )* '{{endrepeat'
             ;

getterFunctions : 'getEnv ' KEYWORD ' ' KEYWORD
                | 'getMetaSelf ' metaoptions
                ;

repeatSubFunctions : 'getEnvRole' KEYWORD
                   | 'getMetaInstance' metaoptions
                   ;

How can I skip unparseable text with ANTLR4?
