As I described in "Antlr greedy-option", I have some problems with a language that can contain string literals inside string literals, for example:
START: "img src="test.jpg""
Mr. Bart Kiers pointed out in that post that it is impossible to write a grammar that solves my problem. I therefore decided to rewrite the input to:
START: "img src='test.jpg'"
before starting the lexer (and parser).
The file input can look like this:
START: "aaa"aaa" "aaa"aaaaa"
:END_START
START: "aaa"aaa" "aaa"aa a aa"
:END_START
START: "aaab"bbaaaa"
:END_START
So I have a solution, but it does not work correctly. I have two questions about my problem (listed below the code). My code is:
public static void main(String[] args) {
    try {
        FileInputStream fis = new FileInputStream("src/file.txt");
        String preparedCode = preparingCode(fis);
        ANTLRStringStream in = new ANTLRStringStream(preparedCode);
        TestLexer lex = new TestLexer(in);
        CommonTokenStream tokens = new CommonTokenStream(lex);
        TestParser parser = new TestParser(tokens);
        parser.rule();
    } catch (IOException ex) {
        ex.printStackTrace();
    } catch (RecognitionException e) {
        System.out.println(e.getMessage());
        System.exit(0);
    }
}

static String preparingCode(FileInputStream input) {
    DataInputStream data = new DataInputStream(input);
    StringBuilder oldCode = new StringBuilder();
    StringBuffer newCode = new StringBuffer(oldCode.length());
    Pattern pattern = Pattern.compile("(START:\\s\")(.+)(\"\\n:END_START)");
    String strLine;
    try {
        while ((strLine = data.readLine()) != null)
            oldCode.append(strLine + "\n");
    } catch (IOException ex) {
        ex.printStackTrace();
    }
    Matcher matcher = pattern.matcher(oldCode);
    while (matcher.find()) {
        // eliminate quotes inside a string literal
        String stringLiteral = matcher.group(2).replaceAll("\"", "'");
        String replace = matcher.group(1) + stringLiteral + matcher.group(3);
        matcher.appendReplacement(newCode, Matcher.quoteReplacement(replace));
    }
    matcher.appendTail(newCode);
    System.out.println(newCode);
    return newCode.toString();
}
My questions are:
START: "aaa'aaa' 'aaa'aaaaa"
:END_START
START: "aaa'aaa' 'aa'aa a aa"
:END_START
START: "aaab'bbaaaa"
:END_START
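I experimented with the pattern flag Pattern.DOTALL:
Pattern pattern = Pattern.compile("(START:\\s\")(.+)(\"\\n:END_START)", Pattern.DOTALL);
But that is not a solution either: with DOTALL the greedy .+ also crosses line breaks, so a single match swallows everything from the first START: to the last :END_START.
- Once I have the right pattern, is there a more efficient way to solve this?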
The non-greedy approach:
Pattern pattern = Pattern.compile("(START:\\s\")(.+?)(\"\\n:END_START)", Pattern.DOTALL);
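Answer 0 (score: 0)
Fix for the first problem
I have to combine the pattern flag Pattern.DOTALL with the non-greedy quantifier (.+? instead of .+):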
Pattern pattern = Pattern.compile("(START:\\s\")(.+?)(\"\\n:END_START)", Pattern.DOTALL);
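To make the difference between the two quantifiers concrete, here is a minimal sketch that counts how often each pattern matches. The class name QuantifierDemo, the helper countMatches, and the sample INPUT are hypothetical and not part of the original code. The input assumes two blocks, the first with a string literal that spans two lines.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuantifierDemo {
    // Hypothetical sample: two START/END_START blocks; the first literal spans two lines.
    static final String INPUT =
        "START: \"aaa\"aaa\"\n\"aaa\"aaaaa\"\n:END_START\n"
      + "START: \"aaab\"bbaaaa\"\n:END_START\n";

    // Counts how many times the pattern matches the input.
    static int countMatches(Pattern pattern, String input) {
        Matcher matcher = pattern.matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Greedy .+ with DOTALL: runs to the last "\n:END_START in the input.
        Pattern greedy =
            Pattern.compile("(START:\\s\")(.+)(\"\\n:END_START)", Pattern.DOTALL);
        // Reluctant .+? with DOTALL: stops at the first "\n:END_START after each START:.
        Pattern reluctant =
            Pattern.compile("(START:\\s\")(.+?)(\"\\n:END_START)", Pattern.DOTALL);

        System.out.println("greedy matches:    " + countMatches(greedy, INPUT));
        System.out.println("reluctant matches: " + countMatches(reluctant, INPUT));
    }
}
```

On this input the greedy pattern finds one match that covers both blocks, while the reluctant pattern finds one match per block.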
Code:
public static void main(String[] args) {
    try {
        FileInputStream fis = new FileInputStream("src/file.txt");
        String preparedCode = preparingCode(fis);
        ANTLRStringStream in = new ANTLRStringStream(preparedCode);
        TestLexer lex = new TestLexer(in);
        CommonTokenStream tokens = new CommonTokenStream(lex);
        TestParser parser = new TestParser(tokens);
        parser.rule();
    } catch (IOException ex) {
        ex.printStackTrace();
    } catch (RecognitionException e) {
        System.out.println(e.getMessage());
        System.exit(0);
    }
}

static String preparingCode(FileInputStream input) {
    DataInputStream data = new DataInputStream(input);
    StringBuilder oldCode = new StringBuilder();
    StringBuffer newCode = new StringBuffer(oldCode.length());
    Pattern pattern = Pattern.compile("(START:\\s\")(.+?)(\"\\n:END_START)", Pattern.DOTALL);
    String strLine;
    try {
        while ((strLine = data.readLine()) != null)
            oldCode.append(strLine + "\n");
    } catch (IOException ex) {
        ex.printStackTrace();
    }
    Matcher matcher = pattern.matcher(oldCode);
    while (matcher.find()) {
        System.out.println("++++" + matcher.group(2));
        // eliminate quotes inside a string literal
        String stringLiteral = matcher.group(2).replaceAll("\"", "'");
        String replace = matcher.group(1) + stringLiteral + matcher.group(3);
        matcher.appendReplacement(newCode, Matcher.quoteReplacement(replace));
    }
    matcher.appendTail(newCode);
    System.out.println(newCode);
    return newCode.toString();
}
So, is there any other way to solve this?
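One possible variation, offered only as a sketch and not taken from the original post: on Java 9 or later, Matcher.replaceAll accepts a function, which removes the manual appendReplacement/appendTail loop. The class name QuoteNormalizer and the method normalize are hypothetical. The regex is the same one used above, so it relies on the same assumption that each literal ends with a quote directly followed by a newline and :END_START.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuoteNormalizer {
    // Same pattern as in the answer: reluctant body, DOTALL so literals may span lines.
    private static final Pattern BLOCK =
        Pattern.compile("(START:\\s\")(.+?)(\"\\n:END_START)", Pattern.DOTALL);

    // Replaces every double quote inside a START literal with a single quote.
    static String normalize(String source) {
        return BLOCK.matcher(source).replaceAll(match ->
            // quoteReplacement keeps '$' and '\' in the literal from being
            // interpreted as group references.
            Matcher.quoteReplacement(
                match.group(1)
                + match.group(2).replace('"', '\'')
                + match.group(3)));
    }

    public static void main(String[] args) {
        String source = "START: \"img src=\"test.jpg\"\"\n:END_START\n";
        System.out.print(normalize(source));
    }
}
```

The outer quotes stay untouched because they belong to groups 1 and 3; only the text captured by group 2 is rewritten.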