I am trying to parse some messages using an ANTLR grammar. The messages have the following structure:
:20:REF123456
:72:Some narrative text which
may contain new lines and
occassionally other : characters
:80A:Another field
The target output is a table in which the text between the colons is the key, and the text up to the next key is that key's value. For example:
Key | Values
--------------------------------------
20  | REF123456
72  | Some narrative text which
      may contain new lines and
      occassionally other : characters
80A | Another field
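I can write a grammar that does this, based on the reference at http://danielveselka.blogspot.fr/2011/02/antlr-swift-fields-parser.html, but only as long as colons are not allowed in the value fields. Can someone give me guidance on how to solve this?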
Answer 0 (score: 2)
I skipped v3 and used ANTLR v4. A quick demo of how to do this in v4 would look like this:
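grammar Swift;

parse
 : entries? EOF
 ;

entries
 : entry ( LINE_BREAK entry )*
 ;

entry
 : key value
 ;

// A key is a tag wrapped in colons, e.g. ":20:" or ":80A:".
key
 : ':' DATA ':'
 ;

value
 : line ( LINE_BREAK line )*
 ;

// A line may contain colons, but it may not start with one.
// That is what lets the parser tell a continuation line from the next key.
line
 : ( DATA | SPACES ) ( COLON | DATA | SPACES )*
 ;

LINE_BREAK
 : '\r'? '\n'
 | '\r'
 ;

COLON
 : ':'
 ;

// Anything except line breaks, colons, spaces and tabs.
DATA
 : ~[\r\n: \t]+
 ;

SPACES
 : [ \t]+
 ;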
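To make the idea behind the `line` rule concrete, here is a minimal plain-Java sketch (no ANTLR involved) of the same splitting rule: a line that begins with `:TAG:` starts a new entry, and every other line is appended to the current value, so a colon in the middle of a line is treated as ordinary text. The class name `SwiftFieldSplitter` and its `split` method are hypothetical and exist only for illustration. The sketch is also looser than the grammar: it does no validation and keeps only the last value when a tag repeats.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the ANTLR demo: it only illustrates the rule
// "a key is ':' DATA ':' at the start of a line; everything else is value text".
final class SwiftFieldSplitter {

    // Same character set as the DATA token: no line breaks, colons, spaces or tabs.
    private static final Pattern KEY_LINE =
            Pattern.compile("^:([^\\r\\n: \\t]+):(.*)$");

    static Map<String, String> split(String message) {
        Map<String, String> fields = new LinkedHashMap<>(); // keeps insertion order
        String currentKey = null;
        StringBuilder currentValue = new StringBuilder();

        // Same line endings as the LINE_BREAK token: \r\n, \n or a lone \r.
        for (String line : message.split("\\r?\\n|\\r")) {
            Matcher m = KEY_LINE.matcher(line);
            if (m.matches()) {
                // A new key starts: store the entry collected so far.
                if (currentKey != null) {
                    fields.put(currentKey, currentValue.toString());
                }
                currentKey = m.group(1);
                currentValue = new StringBuilder(m.group(2));
            } else if (currentKey != null) {
                // Continuation line: join it to the value with a single space.
                currentValue.append(' ').append(line);
            }
        }
        if (currentKey != null) {
            fields.put(currentKey, currentValue.toString());
        }
        return fields;
    }

    public static void main(String[] args) {
        String input = ":20:REF123456\n"
                + ":72:Some narrative text which\n"
                + "may contain new lines and\n"
                + "occassionally other : characters\n"
                + ":80A:Another field";
        split(input).forEach((k, v) -> System.out.println(k + " | " + v));
    }
}
```

The grammar expresses the same decision declaratively: because `line` must begin with `DATA` or `SPACES`, only a leading colon can open a new `entry`.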
Now all you need to do is attach a listener to a tree walker, listen for enterEntry events, and capture the key and value text. Here is how to do that:
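import org.antlr.v4.runtime.ANTLRInputStream;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.tree.ParseTreeWalker;

public class Main {
    public static void main(String[] args) throws Exception {
        String input = ":20:REF123456\n" +
                ":72:Some narrative text which\n" +
                "may contain new lines and\n" +
                "occassionally other : characters\n" +
                ":80A:Another field";

        // SwiftLexer, SwiftParser and SwiftBaseListener are generated by ANTLR from the grammar above.
        // ANTLRInputStream is deprecated since ANTLR 4.7; CharStreams.fromString(input) replaces it there.
        SwiftLexer lexer = new SwiftLexer(new ANTLRInputStream(input));
        SwiftParser parser = new SwiftParser(new CommonTokenStream(lexer));

        ParseTreeWalker.DEFAULT.walk(new SwiftBaseListener() {
            @Override
            public void enterEntry(SwiftParser.EntryContext ctx) {
                // Strip the surrounding colons from the key, e.g. ":20:" becomes "20".
                String key = ctx.key().getText().replace(":", "");
                // Collapse line breaks and runs of whitespace into single spaces.
                String value = ctx.value().getText().replaceAll("\\s+", " ");
                System.out.printf("key -> %s\nvalue -> %s\n================\n", key, value);
            }
        }, parser.parse());
    }
}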
Running the demo above prints the following to your console:
key -> 20
value -> REF123456
================
key -> 72
value -> Some narrative text which may contain new lines and occassionally other : characters
================
key -> 80A
value -> Another field
================
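The question asked for a "Key | Values" table rather than the key/value dump the demo prints. One possible way to get there, assuming the listener stores each pair in a `LinkedHashMap` instead of printing it, is a small formatting helper like the sketch below. The class `SwiftTable` and its `render` method are hypothetical names introduced for illustration and are not part of ANTLR or the original answer.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: formats already-extracted key/value pairs as a text table.
final class SwiftTable {

    /** Renders the pairs as a "Key | Values" table, one row per entry, in map order. */
    static String render(Map<String, String> fields) {
        // Size the columns from the headers and the data.
        int keyWidth = "Key".length();
        int valueWidth = "Values".length();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            keyWidth = Math.max(keyWidth, e.getKey().length());
            valueWidth = Math.max(valueWidth, e.getValue().length());
        }

        // Left-align the key column; the value column is left as-is.
        String rowFormat = "%-" + keyWidth + "s | %s\n";
        StringBuilder out = new StringBuilder(String.format(rowFormat, "Key", "Values"));

        // Separator line spanning both columns plus the " | " divider.
        for (int i = 0; i < keyWidth + 3 + valueWidth; i++) {
            out.append('-');
        }
        out.append('\n');

        for (Map.Entry<String, String> e : fields.entrySet()) {
            out.append(String.format(rowFormat, e.getKey(), e.getValue()));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // In the ANTLR demo these pairs would be collected inside enterEntry.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("20", "REF123456");
        fields.put("80A", "Another field");
        System.out.print(render(fields));
    }
}
```

A `LinkedHashMap` is used here so that rows come out in the order the fields appear in the message. If a tag can repeat within one message, a list of pairs would be a safer container than a map.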