Question

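I am building a Markdown-esque markup interface. For example, when the user types **example string**, a regular expression finds the two occurrences of ** (which delimit bold text), and the plain text is changed to <b>**example string**</b> and rendered as HTML.

My idea for parsing the user's input into HTML is as follows: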

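1. For each rule in the list of regex rules
2. For each occurrence of the start pattern (of the current regex rule)
3. Take all text after the end of the start pattern (call it the start substring)
4. For the first instance of the end pattern in the start substring
5. Take substring(start_match.start() + end_match.end()) from the text
6. Append it to the final text string, which is initially empty
7. Cull the remaining text by substring(start_match.start() + end_match.end()) and feed that back in as the text in step 2.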

My code:

    public static String process(String input_text) {
        String final_text = "";
        String current_text = input_text;
        for (MarkdownRule rule : _rules) {
            Pattern s_ptrn = rule.getStartPattern(); // Start pattern
            Pattern e_ptrn = rule.getEndPattern();   // End pattern

            /* For each occurrence of the start pattern */
            Matcher s_matcher = s_ptrn.matcher(current_text);
            while (s_matcher.find()) {
                int s_end = s_matcher.end();
                int s_start = s_matcher.start();

                /* Take all text after the end of start match */
                String working_text = current_text.substring(s_end); // ERROR HERE

                /* For first instance of end pattern in remaining text */
                Matcher e_matcher = e_ptrn.matcher(working_text);
                if (e_matcher.find()) {
                    /* Take full substring from current text */
                    int e_end = e_matcher.end();
                    working_text = current_text.substring(s_start, s_end + e_end);

                    /* Append to final text */
                    working_text = new StringBuilder(working_text).insert(0, "<b>").append("</b>").toString();
                    final_text = new StringBuilder(final_text).append(working_text).toString();

                    /* Remove working text from current text */
                    current_text = new StringBuilder(current_text).substring(s_start + e_end);
                }
            }
        }
        return final_text;
    }

In theory this should work, but I get a StringIndexOutOfBoundsException on this line:

/* Take all text after the end of start match */
String working_text = current_text.substring(s_end);

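This happens when I use the input text **example**. I believe the first occurrence of the start pattern (at indices 0 and 1) is handled correctly, but the string is then not culled properly, so the loop runs again on the bare text ** and goes out of bounds. (I can't be certain of this, though; it is only what my own testing suggests.)

Unfortunately, my troubleshooting has not fixed the error. Thanks in advance for any help.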

Answer 1

You are changing (shrinking) current_text:

/* Remove working text from current text */
current_text = new StringBuilder(current_text).substring(s_start + e_end);

But the matcher was created with, and still holds, the original current_text string, which does not change no matter what you do to current_text afterwards:

/* For each occurrence of the start pattern */
Matcher s_matcher = s_ptrn.matcher(current_text);
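
To make this concrete, here is a minimal sketch that reproduces the mismatch in isolation. The class and method names are made up for illustration, and the hard-coded substring(9) stands in for the question's substring(s_start + e_end) on the input **example**.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherSnapshotDemo {
    /** Returns { end offset reported by the stale matcher, length of the reassigned string }. */
    static int[] staleOffsets() {
        String currentText = "**example**";      // 11 characters
        Matcher matcher = Pattern.compile("\\*\\*").matcher(currentText);

        matcher.find();                           // opening "**" at offsets 0..2
        currentText = currentText.substring(9);   // the variable now refers to "**"

        matcher.find();                           // closing "**", found in the ORIGINAL string
        return new int[] { matcher.end(), currentText.length() };
    }

    public static void main(String[] args) {
        int[] r = staleOffsets();
        System.out.println("matcher.end() = " + r[0] + ", currentText.length() = " + r[1]);
    }
}
```

The matcher reports an end offset of 11 while the reassigned string has length 2, so calling currentText.substring(matcher.end()) at that point throws StringIndexOutOfBoundsException, which is the failure described in the question.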

You need to create a new matcher for the new string.
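
One possible way to apply this is sketched below, not as a drop-in replacement for the question's process method. Several things here are assumptions: MarkdownRule is not shown in the question, so the sketch takes the start and end patterns directly; the names MarkupSketch and wrapSpans are hypothetical; and, unlike the question's code, text outside the matched spans is copied to the output rather than discarded.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MarkupSketch {
    /**
     * Wraps every start...end span in <b>...</b>, keeping the markers.
     * Assumes neither pattern can match the empty string.
     */
    static String wrapSpans(String input, Pattern startPattern, Pattern endPattern) {
        StringBuilder finalText = new StringBuilder();
        String currentText = input;

        while (true) {
            // Fresh matcher for the current (possibly shortened) string.
            Matcher startMatcher = startPattern.matcher(currentText);
            if (!startMatcher.find()) {
                break;
            }
            int sStart = startMatcher.start();
            int sEnd = startMatcher.end();

            // Search for the end pattern from sEnd, so offsets stay absolute.
            Matcher endMatcher = endPattern.matcher(currentText);
            if (!endMatcher.find(sEnd)) {
                break; // unterminated span: leave the rest untouched
            }
            int eEnd = endMatcher.end();

            finalText.append(currentText, 0, sStart)       // text before the span
                     .append("<b>")
                     .append(currentText, sStart, eEnd)    // the span, markers included
                     .append("</b>");

            // Shrink the text; the next iteration builds a new matcher for it.
            currentText = currentText.substring(eEnd);
        }
        return finalText.append(currentText).toString();
    }

    public static void main(String[] args) {
        Pattern bold = Pattern.compile("\\*\\*");
        System.out.println(wrapSpans("**example**", bold, bold)); // <b>**example**</b>
    }
}
```

Because a matcher is created inside the loop each time, its offsets always refer to the string that substring is later called on. Calling Matcher.reset(CharSequence) on an existing matcher with the shortened string would be another way to get the same effect.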
