Disambiguating assignment expressions in ANTLR

Date: 2011-09-08 20:32:28

Tags: parsing antlr

The following grammar works, but it also produces a warning:

test.g

grammar test;

options {
  language = Java;
  output = AST;
  ASTLabelType = CommonTree; 
}

program
  : expr ';'!
  ;

term: ID | INT
  ;

assign
  : term ('='^ expr)?
  ;

add : assign (('+' | '-')^ assign)*
  ;

expr: add
  ;

//   T O K E N S

ID  : (LETTER | '_') (LETTER | DIGIT | '_')* ;

INT : DIGIT+ ;

WS  :
    ( ' '
    | '\t'
    | '\r'
    | '\n'
    ) {$channel=HIDDEN;}
    ;

DOT : '.' ;

fragment
LETTER : ('a'..'z'|'A'..'Z') ;

fragment
DIGIT   : '0'..'9' ;

Warning

[15:08:20] warning(200): C:\Users\Charles\Desktop\test.g:21:34: 
Decision can match input such as "'+'..'-'" using multiple alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input

Still, it does generate the tree the way I want it to:

Input: 0 + a = 1 + b = 2 + 3;

ANTLR produces  | ... but I think it
this tree:      | gives the warning
                | because it _could_
  +             | also be parsed this
 / \            | way:
0   =           |  
   / \          |           +     
  a   +         |         /   \   
     / \        |        +     3  
    1   =       |     /     \     
       / \      |    +       =    
      b   +     |   / \     / \   
         / \    |  0   =   b   2  
        2   3   |     / \         
                |    a   1        

How can I explicitly tell ANTLR that I want it to create the AST on the left, making my intent clear and silencing the warning?

1 answer:

Answer 0 (score: 5)

  

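Charles wrote:

How can I explicitly tell ANTLR that I want it to create the AST on the left, making my intent clear and silencing the warning?

You shouldn't create two separate rules for assign and add. As your rules stand now, assign has precedence over add, which you don't want: judging by your desired AST, they should have equal precedence. So you need to wrap all the operators +, - and = in a single rule: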

program
  :  expr ';'!
  ;

expr
  :  term (('+' | '-' | '=')^ expr)*
  ;
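To make the shape of the resulting tree concrete, here is a minimal hand-written sketch of a recursive-descent parser that mirrors the single rule above. It is not ANTLR-generated code: the class name SingleRuleSketch, the naive tokenizer and the LISP-style string output are all assumptions made for illustration. The sketch simply takes every operator it sees (a greedy choice), which is the decision ANTLR cannot make by itself without a warning.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical hand-written sketch of  expr : term (op^ expr)* ;  (not ANTLR output). */
class SingleRuleSketch {
    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    SingleRuleSketch(String source) {
        // Naive tokenizer: pad the operators and ';' with spaces, then split on whitespace.
        for (String t : source.replaceAll("([-+=;])", " $1 ").trim().split("\\s+")) {
            tokens.add(t);
        }
    }

    private boolean atOperator() {
        if (pos >= tokens.size()) return false;
        String t = tokens.get(pos);
        return t.equals("+") || t.equals("-") || t.equals("=");
    }

    // expr : term (op^ expr)* ;
    String expr() {
        String tree = tokens.get(pos++);           // term (assumed to be an ID or INT)
        while (atOperator()) {                     // greedy: always take the operator
            String op = tokens.get(pos++);
            String right = expr();                 // the recursive call consumes the rest
            tree = "(" + op + " " + tree + " " + right + ")";  // like op^: operator becomes the root
        }
        return tree;
    }

    // program : expr ';'! ;
    static String parse(String source) {
        SingleRuleSketch p = new SingleRuleSketch(source);
        String tree = p.expr();
        if (!";".equals(p.tokens.get(p.pos))) {
            throw new IllegalStateException("expected ';'");
        }
        return tree;
    }

    public static void main(String[] args) {
        // prints (+ 0 (= a (+ 1 (= b (+ 2 3)))))
        System.out.println(parse("0 + a = 1 + b = 2 + 3;"));
    }
}
```

Because the recursive call swallows everything to the right of each operator, the loop in the outermost call runs only once, and the result is the right-nested tree from the left-hand side of the question.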

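But now the grammar is still ambiguous. You need to "help" the parser look beyond this ambiguity, so that it is sure an operator followed by an expr really is ahead when it parses (('+' | '-' | '=') expr)*. This can be done using a syntactic predicate, which looks like this: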

(look_ahead_rule(s)_in_here)=> rule(s)_to_actually_parse

(the ( ... )=> part is the predicate syntax)

A small demo:
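Conceptually, a syntactic predicate means "try to match this from the current position, rewind, and only then decide". The following hand-written Java sketch illustrates that idea for a rule of the form term ((op expr)=> op expr)*. It is an assumption-laden illustration, not what ANTLR generates: the class PredicateSketch and all of its method names are hypothetical, and tokens are passed in as plain strings.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of a syntactic predicate as speculate-then-rewind (not ANTLR output). */
class PredicateSketch {
    private final List<String> tokens;
    private int pos = 0;

    PredicateSketch(String... tokens) { this.tokens = Arrays.asList(tokens); }

    int position() { return pos; }

    private String peek() { return pos < tokens.size() ? tokens.get(pos) : "<EOF>"; }

    private boolean isOp(String t) { return t.equals("+") || t.equals("-") || t.equals("="); }

    private boolean isTerm(String t) { return t.matches("[A-Za-z_][A-Za-z_0-9]*|[0-9]+"); }

    // Recognize "op expr" without building a tree.
    private boolean matchOpExpr() {
        if (!isOp(peek())) return false;
        pos++;
        return matchExpr();
    }

    // Recognize "term (op expr)*" without building a tree.
    private boolean matchExpr() {
        if (!isTerm(peek())) return false;
        pos++;
        while (isOp(peek())) {
            if (!matchOpExpr()) return false;
        }
        return true;
    }

    // (op expr)=> : speculate from the current position, then always rewind.
    boolean predicateOpExpr() {
        int mark = pos;
        boolean ok = matchOpExpr();
        pos = mark;
        return ok;
    }

    // expr : term ((op expr)=> op^ expr)* ;
    String expr() {
        String tree = tokens.get(pos++);           // term (assumed valid here)
        while (predicateOpExpr()) {                // enter the loop only if "op expr" really is ahead
            String op = tokens.get(pos++);
            String right = expr();
            tree = "(" + op + " " + tree + " " + right + ")";
        }
        return tree;
    }

    public static void main(String[] args) {
        System.out.println(new PredicateSketch("a", "=", "1", "+", "b", ";").expr());
    }
}
```

In this sketch the speculation re-scans the remaining input at every nesting level, which is acceptable for short expressions but shows why predicates are best kept to the decisions that actually need them.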

grammar test;

options {
  output=AST;
  ASTLabelType=CommonTree; 
}

program
  :  expr ';'!
  ;

expr
  :  term ((op expr)=> op^ expr)*
  ;

op
  :  '+' 
  |  '-'
  |  '='
  ;

term
  :  ID 
  |  INT
  ;

ID  : (LETTER | '_') (LETTER | DIGIT | '_')* ;
INT : DIGIT+ ;
WS  : (' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;};

fragment LETTER : ('a'..'z'|'A'..'Z');
fragment DIGIT  : '0'..'9';

It can be tested with the following class:

import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import org.antlr.stringtemplate.*;

public class Main {
  public static void main(String[] args) throws Exception {
    String source =  "0 + a = 1 + b = 2 + 3;";
    testLexer lexer = new testLexer(new ANTLRStringStream(source));
    testParser parser = new testParser(new CommonTokenStream(lexer));
    CommonTree tree = (CommonTree)parser.program().getTree();
    DOTTreeGenerator gen = new DOTTreeGenerator();
    StringTemplate st = gen.toDOT(tree);
    System.out.println(st);
  }
}

The output of the Main class corresponds to the following AST:

[image: the resulting AST]

which is created without any warnings from ANTLR.