Question

I don't know how to handle the '(', ')' and '*' characters that can appear inside the comments. The comments are multi-line.

Answer 1

A simple pattern that handles this is:

\(\*(.*?)\*\)

Example: http://www.rubular.com/r/afqLCDssIx

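Because the comments span several lines, you will probably also want to set the single-line (DOTALL) flag so that . matches line breaks too: (?s)\(\*(.*?)\*\)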
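As a minimal sketch of how that pattern might be used from Java: the class and method names below (SimpleCommentFinder, findCommentBodies) are made up for illustration, and Pattern.DOTALL is used as the equivalent of the inline (?s) flag.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class SimpleCommentFinder {
    // Lazy pattern from the answer; DOTALL lets '.' match line terminators.
    private static final Pattern COMMENT =
            Pattern.compile("\\(\\*(.*?)\\*\\)", Pattern.DOTALL);

    // Returns the body (capture group 1) of every non-nested (* ... *) comment.
    static List<String> findCommentBodies(String source) {
        List<String> bodies = new ArrayList<>();
        Matcher m = COMMENT.matcher(source);
        while (m.find()) {
            bodies.add(m.group(1));
        }
        return bodies;
    }

    public static void main(String[] args) {
        String source = "begin (* first\n   comment *) x := 1; (* second *) end.";
        // Prints each comment body, including the one that spans two lines.
        for (String body : findCommentBodies(source)) {
            System.out.println("[" + body + "]");
        }
    }
}
```

Without DOTALL (or (?s)) the first comment in this sample would not match, because . would stop at the line break.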

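Note that this will not handle (* inside string literals or other odd combinations. Your best bet is to use a parser generator such as ANTLR, which already has a Pascal grammar (direct link).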

Answer 2

If you want to find the innermost nested comment, for example with /* */ delimiters in input such as

/* 
/*
comment1
/*
comment2
*/
*/
*/

the regular expression

\/\*[^/*]*(?:(?!\/\*|\*\/)[/*][^/*]*)*\*\/

finds

/*
comment2
*/
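A small Java sketch of the same idea, assuming the regex is used unchanged apart from Java string escaping. The class name InnermostCStyleComment and the method firstInnermost are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class InnermostCStyleComment {
    // Same regex as above; every backslash is doubled for the Java string literal.
    private static final Pattern INNERMOST = Pattern.compile(
            "\\/\\*[^/*]*(?:(?!\\/\\*|\\*\\/)[/*][^/*]*)*\\*\\/");

    // Returns the first innermost /* ... */ comment, or null if there is none.
    static String firstInnermost(String source) {
        Matcher m = INNERMOST.matcher(source);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String nested = "/* \n/*\ncomment1\n/*\ncomment2\n*/\n*/\n*/";
        // Prints only the comment2 block, delimiters included.
        System.out.println(firstInnermost(nested));
    }
}
```

The negative lookahead is what rejects the outer openers: a match cannot run past another /* or an earlier */, so only a comment with no comment inside it can match.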

Answer 3

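Regarding the handling of nested comments: although you cannot use a Java regex to match the outermost comment, you can craft one that matches the innermost comment (with some notable exceptions; see the caveat below). (Note: the \(\*(.*?)\*\) expression will not work in this case, because it does not correctly match the innermost comment.) The following is a tested Java program that uses a (heavily commented) regex matching only innermost comments, and applies it iteratively to strip nested comments correctly: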

public class TEST {
    public static void main(String[] args) {
        String subjectString = "out1 (* c1 *) out2 (* c2 (* c3 *) c2 *) out3";
        String regex = "" +
            "# Match an innermost pascal '(*...*)' style comment.\n" +
            "\\(\\*      # Comment opening literal delimiter.\n" +
            "[^(*]*      # {normal*} Zero or more non-'(', non-'*'.\n" +
            "(?:         # Begin {(special normal*)*} construct.\n" +
            "  (?!       # If we are not at the start of either...\n" +
            "    \\(\\*  # a nested comment\n" +
            "  | \\*\\)  # or the end of this comment,\n" +
            "  ) [(*]    # then ok to match a '(' or '*'.\n" +
            "  [^(*]*    # more {normal*}.\n" +
            ")*          # end {(special normal*)*} construct.\n" +
            "\\*\\)      # Comment closing literal delimiter.";
        // Start from the input so that text without comments is printed unchanged (not "null").
        String resultString = subjectString;
        java.util.regex.Pattern p = java.util.regex.Pattern.compile(
                    regex,
                    java.util.regex.Pattern.COMMENTS);
        java.util.regex.Matcher m = p.matcher(subjectString);
        while (m.find())
        { // Iterate until there are no more "(* comments *)".
            resultString = m.replaceAll("");
            m = p.matcher(resultString);
        }
        System.out.println(resultString);
    }
}

Here is a short version of the regex (in native regex format):

\(\*[^(*]*(?:(?!\(\*|\*\))[(*][^(*]*)*\*\)

Note that this regex implements Jeffrey Friedl's efficient "unrolling-the-loop" technique and is quite fast. (See: Mastering Regular Expressions (3rd Edition).)
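To make the difference between the two patterns concrete, here is a rough side-by-side sketch on the same sample string. The class name LazyVersusInnermost and both method names are invented for this illustration. The loop is written as "repeat until nothing changes", which is a slight variation on the program above rather than a copy of it.

```java
import java.util.regex.Pattern;

// Hypothetical comparison class, not part of the original answer.
class LazyVersusInnermost {
    // The simple lazy pattern from the first answer.
    static final Pattern LAZY =
            Pattern.compile("\\(\\*(.*?)\\*\\)", Pattern.DOTALL);
    // The short innermost-comment pattern.
    static final Pattern INNERMOST = Pattern.compile(
            "\\(\\*[^(*]*(?:(?!\\(\\*|\\*\\))[(*][^(*]*)*\\*\\)");

    // One pass with the lazy pattern: pairs each "(*" with the nearest "*)".
    static String stripLazy(String s) {
        return LAZY.matcher(s).replaceAll("");
    }

    // Repeatedly removes innermost comments until the text stops changing.
    static String stripNested(String s) {
        String previous;
        do {
            previous = s;
            s = INNERMOST.matcher(s).replaceAll("");
        } while (!s.equals(previous));
        return s;
    }

    public static void main(String[] args) {
        String subject = "out1 (* c1 *) out2 (* c2 (* c3 *) c2 *) out3";
        System.out.println(stripLazy(subject));   // leaves a stray " c2 *)" behind
        System.out.println(stripNested(subject)); // all comment text removed
    }
}
```

The lazy pattern closes the outer comment at the inner *), so the tail of the outer comment survives. The innermost pattern peels one nesting level per pass.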

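Caveat: this will certainly not work correctly if any of the comment delimiters (i.e. (* or *)) appear inside a string literal, so it should not be used for general-purpose parsing. But a regex like this is handy to have from time to time, for example for a quick-and-dirty search in an editor.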
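The caveat is easy to reproduce. The following sketch (the class name StringLiteralCaveat and the sample Pascal line are made up) runs the iterative stripping over a line whose string literal happens to contain comment delimiters.

```java
import java.util.regex.Pattern;

// Hypothetical demo of the failure mode described in the caveat.
class StringLiteralCaveat {
    static final Pattern INNERMOST = Pattern.compile(
            "\\(\\*[^(*]*(?:(?!\\(\\*|\\*\\))[(*][^(*]*)*\\*\\)");

    // Removes innermost comments repeatedly until no more are found.
    static String stripComments(String s) {
        String previous;
        do {
            previous = s;
            s = INNERMOST.matcher(s).replaceAll("");
        } while (!s.equals(previous));
        return s;
    }

    public static void main(String[] args) {
        // The quoted text is a Pascal string literal, not a comment.
        String source = "s := '(* not a comment *)'; (* real *)";
        // The regex cannot tell the difference, so the literal's contents are lost.
        System.out.println("[" + stripComments(source) + "]");
    }
}
```

The real comment is removed as intended, but the string literal is emptied as well, which is why a proper lexer or parser is the safer choice for real source code.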

For those who want to handle nested C-style comments, see my answer to a similar question.

How do I use Java regular expressions to find (* comments *)?
