ANTLR fuzzy parsing

Date: 2013-09-03 11:49:45

Tags: antlr antlr3

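I am building a kind of preprocessor in ANTLR v3, which of course only works with fuzzy parsing. At the moment I am trying to parse include statements and replace them with the contents of the corresponding file. I used this example: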

ANTLR: removing clutter

Based on that example, I wrote the following grammar:

grammar preprocessor;

options {
    language='Java';
}

@lexer::header {

package antlr_try_1;

}

@parser::header {

package antlr_try_1;

}

parse
 : (t=. {System.out.print($t.text);})* EOF
 ;

INCLUDE_STAT
 : 'include' (' ' | '\r' | '\t' | '\n')+ ('A'..'Z' | 'a'..'z' | '_' | '-' | '.')+
   {
     setText("Include statement found!");
   }
 ;

Any
 : . // fall through rule, matches any character
 ;

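This grammar only prints the text and replaces each include statement with the string "Include statement found!". The sample text to parse looks like this: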

some random input
some random input
some random input

include some_file.txt

some random input
some random input
some random input

The resulting output looks like this:

C:\Users\andriyn\Documents\SandBox\text_files\asd.txt line 1:14 mismatched character 'p' expecting 'c'
C:\Users\andriyn\Documents\SandBox\text_files\asd.txt line 2:14 mismatched character 'p' expecting 'c'
C:\Users\andriyn\Documents\SandBox\text_files\asd.txt line 3:14 mismatched character 'p' expecting 'c'
C:\Users\andriyn\Documents\SandBox\text_files\asd.txt line 7:14 mismatched character 'p' expecting 'c'
C:\Users\andriyn\Documents\SandBox\text_files\asd.txt line 8:14 mismatched character 'p' expecting 'c'
C:\Users\andriyn\Documents\SandBox\text_files\asd.txt line 9:14 mismatched character 'p' expecting 'c'
some random ut
some random ut
some random ut

Include statement found!

some random ut
some random ut
some random ut

As far as I can tell, the lexer is confused by the "in" in the word "input", because it "thinks" an INCLUDE_STAT token is starting there.

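Is there a better way to do this? I cannot use the filter option, because I need not only the include statements but also the rest of the code. I have tried a few other things but could not find a suitable solution.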

1 answer:

Answer 0: (score: 1)

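You are observing a limitation of ANTLR 3. You can correct the immediate problem with either of the following options:

  1. Upgrade to ANTLR 4, which does not have this limitation.
  2. Add the following syntactic predicate at the beginning of the INCLUDE_STAT rule: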

    `('include' (' ' | '\r' | '\t' | '\n')+ ('A'..'Z' | 'a'..'z' | '_' | '-' | '.')+) =>`
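
To make the effect of the predicate concrete, here is a minimal hand-written Java sketch of the "try the whole include statement first, otherwise emit one character" behaviour. It is not the code ANTLR generates. The class name `FuzzyIncludeScanner`, the method `rewrite`, and the use of `java.util.regex` are assumptions made purely for illustration; only the character sets and the replacement string come from the grammar in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class, not part of ANTLR or of the original grammar.
public class FuzzyIncludeScanner {

    // Same shape as INCLUDE_STAT: 'include', one or more whitespace characters,
    // then one or more of A-Z, a-z, '_', '-', '.'.
    private static final Pattern INCLUDE_STAT =
            Pattern.compile("include[ \\r\\t\\n]+[A-Za-z_.\\-]+");

    // Replaces every include statement and copies all other characters unchanged.
    static String rewrite(String input) {
        StringBuilder out = new StringBuilder();
        Matcher m = INCLUDE_STAT.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            // Speculative step: does a complete include statement start at pos?
            m.region(pos, input.length());
            if (m.lookingAt()) {
                out.append("Include statement found!");
                pos = m.end();
            } else {
                // Fall through to the equivalent of the Any rule: one character.
                out.append(input.charAt(pos));
                pos++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String sample = "some random input\n\ninclude some_file.txt\n\nsome random input\n";
        System.out.print(rewrite(sample));
    }
}
```

In this sketch the word "input" is copied through intact, because the speculative match at its "i" fails as a whole and only then does the scanner fall back to a single character. That is the behaviour the syntactic predicate is meant to give the ANTLR 3 lexer, instead of committing to INCLUDE_STAT after seeing "in".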