Parsing free text that includes user input

Time: 2016-02-17 16:13:10

Tags: java regex parsing

I am trying to parse strings of the following format in Java:

Number-Action-Msg, Number-Action-Msg, Number-Action-Msg, Number-Action-Msg, ...

For example:

"512-WARN-Cannot update the name.,615-PREVENT-The app is currently down, please try again later.,736-PREVENT-Testing,"

I would like to get an array containing the following entries:

512-WARN-Cannot update the name.
615-PREVENT-The app is currently down, please try again later.
736-PREVENT-Testing

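The problem is that the message is user input, so I can't rely on simply splitting the string on commas. The action will always be either WARN or PREVENT. What is the best way to do this parsing? Thanks!

2 answers:

Answer 0: (score: 3)

This seems straightforward:

Regex: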

WARN|PREVENT

In Java:

// Requires: import java.util.Arrays;
String string = "512-WARN-Cannot update the name.,615-PREVENT-The app is currently down, please try again later.,736-PREVENT-Testing,";
String regex = "WARN|PREVENT";

// Split on the action keyword and print the resulting pieces.
System.out.println(Arrays.toString(string.split(regex)));

This outputs:

[512-, -Cannot update the name.,615-, -The app is currently down, please try again later.,736-, -Testing,]

Of course, you may want to adjust the regex to include the hyphens, for example:

String regex = "-WARN-|-PREVENT-";
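
Splitting on the action keyword separates the numbers from the messages rather than producing whole entries. A minimal sketch of one way to get the entries listed in the question, assuming that a comma separates two entries only when it is directly followed by digits-WARN- or digits-PREVENT- (the class and method names are hypothetical, and a message that itself contains such a sequence would be split incorrectly):

```java
import java.util.Arrays;
import java.util.List;

class EntrySplitter {
    // Hypothetical helper: splits on a comma only when it is followed by
    // "<digits>-WARN-" or "<digits>-PREVENT-", so commas inside messages are kept.
    static List<String> splitEntries(String input) {
        // Drop a single trailing comma, if present, so it does not end up in the last entry.
        String trimmed = input.endsWith(",")
                ? input.substring(0, input.length() - 1)
                : input;
        // The lookahead checks what follows the comma without consuming it.
        return Arrays.asList(trimmed.split(",(?=\\d+-(?:WARN|PREVENT)-)"));
    }

    public static void main(String[] args) {
        String string = "512-WARN-Cannot update the name.,615-PREVENT-The app is currently down, please try again later.,736-PREVENT-Testing,";
        for (String entry : splitEntries(string)) {
            System.out.println(entry);
        }
    }
}
```

With the sample string from the question, each of the three entries is printed on its own line, and the comma inside the second message is preserved.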

Answer 1: (score: 3)

Instead of splitting on commas, you can match using this lookahead-based regex:

(?=,\d+-(?:WARN|PREVENT)|,$)

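(?=,\d+-(?:WARN|PREVENT)|,$) is a positive lookahead asserting that what comes next is either a comma followed by digits-(WARN|PREVENT), or a comma at the end of the line.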
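
As a rough sketch of how the lookahead could be used for matching, the example below puts an entry pattern in front of it: digits, the action, then a lazy .*? that stops as soon as the lookahead succeeds. The entry prefix \d+-(?:WARN|PREVENT)- and the class and method names are assumptions made for illustration, not part of the regex shown above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EntryMatcher {
    // Assumed entry prefix followed by a lazy match, terminated by the lookahead
    // (?=,\d+-(?:WARN|PREVENT)|,$) from the answer.
    private static final Pattern ENTRY = Pattern.compile(
            "\\d+-(?:WARN|PREVENT)-.*?(?=,\\d+-(?:WARN|PREVENT)|,$)");

    // Hypothetical helper: collects every match of ENTRY in the input.
    static List<String> findEntries(String input) {
        List<String> entries = new ArrayList<>();
        Matcher m = ENTRY.matcher(input);
        while (m.find()) {
            // The lookahead is zero-width, so the separating comma is not part of the match.
            entries.add(m.group());
        }
        return entries;
    }

    public static void main(String[] args) {
        String string = "512-WARN-Cannot update the name.,615-PREVENT-The app is currently down, please try again later.,736-PREVENT-Testing,";
        findEntries(string).forEach(System.out::println);
    }
}
```

Because the lookahead only accepts a comma that is followed by a new entry header or by the end of the input, commas inside a message do not end the match. A message that itself contains text such as ",123-WARN" would still be cut short.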