Regexp并不匹配所有

时间:2016-07-31 17:14:15

标签: java regex

我有正则表达式,应删除注释(多行,单行)

(\s*\/\/.*)|((\/\*)(.|\n)+?(\*\/))

当我在Sublime或网站上查看时,它对我来说没问题。 但是当我在程序中使用它时,它没有正常工作。 这就是我的意思。

/**
 * Base class from which all RMI-IIOP stubs must inherit.
 */

public abstract class Stub extends ObjectImpl
implements java.io.Serializable {

// This can only be set at object construction time (no sync necessary).
private transient StubDelegate stubDelegate = null;

regexp仅匹配单行注释,但在其他文本编辑器中,它适用于所有类型。

这是我的代码:

FileInputStream fis = new FileInputStream(filePath);
    BufferedReader br = new BufferedReader(new InputStreamReader(fis));
    String strLine;
    while ((strLine = br.readLine()) != null) {
        Matcher m = Pattern.compile(delCommentsPattern,Pattern.MULTILINE).matcher(strLine);
        while (m.find()) {
            String b = m.group();
            System.out.println(b);
  }
}

我做错了什么?

UPD:添加delCommentsPattern:

 private String delCommentsPattern ="(\\s*\\/\\/.*)|((\\/\\*)(.|\\n)+?(\\*\\/))";

UPD2:我已经完成了,RAVI推荐了什么,但现在我有一个例外,当我在更大的文件上使用它时:

    Exception in thread "JavaFX Application Thread" java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
    at javafx.fxml.FXMLLoader$MethodHandler.invoke(FXMLLoader.java:1774)
    at javafx.fxml.FXMLLoader$ControllerMethodEventHandler.handle(FXMLLoader.java:1657)
    at com.sun.javafx.event.CompositeEventHandler.dispatchBubblingEvent(CompositeEventHandler.java:86)
    at com.sun.javafx.event.EventHandlerManager.dispatchBubblingEvent(EventHandlerManager.java:238)
    at com.sun.javafx.event.EventHandlerManager.dispatchBubblingEvent(EventHandlerManager.java:191)
    at com.sun.javafx.event.CompositeEventDispatcher.dispatchBubblingEvent(CompositeEventDispatcher.java:59)
    at com.sun.javafx.event.BasicEventDispatcher.dispatchEvent(BasicEventDispatcher.java:58)
    at com.sun.javafx.event.EventDispatchChainImpl.dispatchEvent(EventDispatchChainImpl.java:114)
    at com.sun.javafx.event.BasicEventDispatcher.dispatchEvent(BasicEventDispatcher.java:56)
    at com.sun.javafx.event.EventDispatchChainImpl.dispatchEvent(EventDispatchChainImpl.java:114)
    at com.sun.javafx.event.BasicEventDispatcher.dispatchEvent(BasicEventDispatcher.java:56)
    at com.sun.javafx.event.EventDispatchChainImpl.dispatchEvent(EventDispatchChainImpl.java:114)
    at com.sun.javafx.event.EventUtil.fireEventImpl(EventUtil.java:74)
    at com.sun.javafx.event.EventUtil.fireEvent(EventUtil.java:49)
    at javafx.event.Event.fireEvent(Event.java:198)
    at javafx.scene.Node.fireEvent(Node.java:8411)
    at javafx.scene.control.Button.fire(Button.java:185)
    at com.sun.javafx.scene.control.behavior.ButtonBehavior.mouseReleased(ButtonBehavior.java:182)
    at com.sun.javafx.scene.control.skin.BehaviorSkinBase$1.handle(BehaviorSkinBase.java:96)
    at com.sun.javafx.scene.control.skin.BehaviorSkinBase$1.handle(BehaviorSkinBase.java:89)
    at com.sun.javafx.event.CompositeEventHandler$NormalEventHandlerRecord.handleBubblingEvent(CompositeEventHandler.java:218)
    at com.sun.javafx.event.CompositeEventHandler.dispatchBubblingEvent(CompositeEventHandler.java:80)
    at com.sun.javafx.event.EventHandlerManager.dispatchBubblingEvent(EventHandlerManager.java:238)
    at com.sun.javafx.event.EventHandlerManager.dispatchBubblingEvent(EventHandlerManager.java:191)
    at com.sun.javafx.event.CompositeEventDispatcher.dispatchBubblingEvent(CompositeEventDispatcher.java:59)
    at com.sun.javafx.event.BasicEventDispatcher.dispatchEvent(BasicEventDispatcher.java:58)
    at com.sun.javafx.event.EventDispatchChainImpl.dispatchEvent(EventDispatchChainImpl.java:114)
    at com.sun.javafx.event.BasicEventDispatcher.dispatchEvent(BasicEventDispatcher.java:56)
    at com.sun.javafx.event.EventDispatchChainImpl.dispatchEvent(EventDispatchChainImpl.java:114)
    at com.sun.javafx.event.BasicEventDispatcher.dispatchEvent(BasicEventDispatcher.java:56)
    at com.sun.javafx.event.EventDispatchChainImpl.dispatchEvent(EventDispatchChainImpl.java:114)
    at com.sun.javafx.event.EventUtil.fireEventImpl(EventUtil.java:74)
    at com.sun.javafx.event.EventUtil.fireEvent(EventUtil.java:54)
    at javafx.event.Event.fireEvent(Event.java:198)
    at javafx.scene.Scene$MouseHandler.process(Scene.java:3757)
    at javafx.scene.Scene$MouseHandler.access$1500(Scene.java:3485)
    at javafx.scene.Scene.impl_processMouseEvent(Scene.java:1762)
    at javafx.scene.Scene$ScenePeerListener.mouseEvent(Scene.java:2494)
    at com.sun.javafx.tk.quantum.GlassViewEventHandler$MouseEventNotification.run(GlassViewEventHandler.java:380)
    at com.sun.javafx.tk.quantum.GlassViewEventHandler$MouseEventNotification.run(GlassViewEventHandler.java:294)
    at java.security.AccessController.doPrivileged(Native Method)
    at com.sun.javafx.tk.quantum.GlassViewEventHandler.lambda$handleMouseEvent$354(GlassViewEventHandler.java:416)
    at com.sun.javafx.tk.quantum.QuantumToolkit.runWithoutRenderLock(QuantumToolkit.java:389)
    at com.sun.javafx.tk.quantum.GlassViewEventHandler.handleMouseEvent(GlassViewEventHandler.java:415)
    at com.sun.glass.ui.View.handleMouseEvent(View.java:555)
    at com.sun.glass.ui.View.notifyMouse(View.java:937)
    at com.sun.glass.ui.gtk.GtkApplication._runLoop(Native Method)
    at com.sun.glass.ui.gtk.GtkApplication.lambda$null$49(GtkApplication.java:139)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
    at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275)
    at javafx.fxml.FXMLLoader$MethodHandler.invoke(FXMLLoader.java:1769)
    ... 48 more
Caused by: java.lang.StackOverflowError
    at java.util.regex.Pattern$LazyLoop.match(Pattern.java:4843)
    at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
    at java.util.regex.Pattern$BranchConn.match(Pattern.java:4568)
    at java.util.regex.Pattern$CharProperty.match(Pattern.java:3777)
    at java.util.regex.Pattern$Branch.match(Pattern.java:4604)
    at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
    at java.util.regex.Pattern$LazyLoop.match(Pattern.java:4847)
    at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
    at java.util.regex.Pattern$BranchConn.match(Pattern.java:4568)
    at java.util.regex.Pattern$CharProperty.match(Pattern.java:3777)
    at java.util.regex.Pattern$Branch.match(Pattern.java:4604)
    at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
    at java.util.regex.Pattern$LazyLoop.match(Pattern.java:4847)
    at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
    at java.util.regex.Pattern$BranchConn.match(Pattern.java:4568)
    at java.util.regex.Pattern$CharProperty.match(Pattern.java:3777)
    at java.util.regex.Pattern$Branch.match(Pattern.java:4604)
    at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
    at java.util.regex.Pattern$LazyLoop.match(Pattern.java:4847)
    at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
    at java.util.regex.Pattern$BranchConn.match(Pattern.java:4568)
    at java.util.regex.Pattern$CharProperty.match(Pattern.java:3777)
    at java.util.regex.Pattern$Branch.match(Pattern.java:4604)
    at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
    at java.util.regex.Pattern$LazyLoop.match(Pattern.java:4847)
    at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
    at java.util.regex.Pattern$BranchConn.match(Pattern.java:4568)
    at java.util.regex.Pattern$CharProperty.match(Pattern.java:3777)

4 个答案:

答案 0 :(得分:4)

由于您逐行读取文件,因此正则表达式无法匹配跨越多行的文本。您需要先将整个文件读入一个字符串,然后执行while循环。

但请注意,您的正则表达式在以下代码中存在问题,看起来它开始发表评论:

List<File> files = FileUtils.find("src/*.java");   // Note the /*

您应该使用Java标记生成器并过滤其输出以进行注释,而不是正则表达式。 https://github.com/javaparser/javaparser看起来像是这项工作的合适工具。它甚至有一些test cases用于解析注释。

答案 1 :(得分:1)

正如其他人所说,在运行此正则表达式之前,必须将整个文件读入字符串。

请注意,为了解析注释,您必须在同一个分析字符串 时间,否则任何一个都可以隐藏在另一个中。

这是弦乐的正则表达式 由于您正在删除评论,因此此正则表达式会保留格式设置以便进行清除 移除后看起来并没有结合在一起。

"(?m)((?:(?:^[ \\t]*)?(?:/\\*[^*]*\\*+(?:[^/*][^*]*\\*+)*/(?:[ \\t]*\\r?\\n(?=[ \\t]*(?:\\r?\\n|/\\*|//)))?|//(?:[^\\\\]|\\\\(?:\\r?\\n)?)*?(?:\\r?\\n(?=[ \\t]*(?:\\r?\\n|/\\*|//))|(?=\\r?\\n))))+)|(\"[^\"\\\\]*(?:\\\\[\\S\\s][^\"\\\\]*)*\"|'[^'\\\\]*(?:\\\\[\\S\\s][^'\\\\]*)*'|(?:\\r?\\n|[\\S\\s])[^/\"'\\\\\\s]*)"

将全部替换为$2

说明(如果需要)

   (?m)                             # Multi-line modifier
   (                                # (1 start), Comments 
        (?:
             (?: ^ [ \t]* )?                  # <- To preserve formatting
             (?:
                  /\*                              # Start /* .. */ comment
                  [^*]* \*+
                  (?: [^/*] [^*]* \*+ )*
                  /                                # End /* .. */ comment
                  (?:                              # <- To preserve formatting 
                       [ \t]* \r? \n                                      
                       (?=
                            [ \t]*                  
                            (?: \r? \n | /\* | // )
                       )
                  )?
               |  
                  //                               # Start // comment
                  (?:                              # Possible line-continuation
                       [^\\] 
                    |  \\ 
                       (?: \r? \n )?
                  )*?
                  (?:                              # End // comment
                       \r? \n                               
                       (?=                              # <- To preserve formatting
                            [ \t]*                          
                            (?: \r? \n | /\* | // )
                       )
                    |  (?= \r? \n )
                  )
             )
        )+                               # Grab multiple comment blocks if need be
   )                                # (1 end)

|                                 ## OR

   (                                # (2 start), Non - comments 
        "
        [^"\\]*                          # Double quoted text
        (?: \\ [\S\s] [^"\\]* )*
        "
     |  '
        [^'\\]*                          # Single quoted text
        (?: \\ [\S\s] [^'\\]* )*
        ' 
     |  (?: \r? \n | [\S\s] )            # Linebreak or Any other char
        [^/"'\\\s]*                      # Chars which doesn't start a comment, string, escape,
                                         # or line continuation (escape + newline)
   )                                # (2 end)

答案 2 :(得分:0)

由于您正在逐行处理,因此无法正常工作。

首先,您需要读取字符串中的完整文件。然后在上面使用正则表达式。

String strLine;
StringBuffer sb = new StringBuffer("");
while ((strLine = br.readLine()) != null) {
    sb.append(strLine + "\n");
}

Matcher m = Pattern.compile(delCommentsPattern,Pattern.MULTILINE).matcher(sb.toString());
while (m.find()) {
    String b = m.group();
    System.out.println(b);
}

答案 3 :(得分:-1)

如果我正确理解你要做的事情(即提取注释),请尝试使用此字符串作为Java中的正则表达式:

String regex = "(/\\*\\*.*\\*/)|(//.*?$)";
顺便说一句,你必须确保你逃避星号和其他保留字符。另一件事是你需要读取文件,如果你想做多行正则表达式(即如果正则表达式正在寻找多行注释,你就不能在一行上匹配)。

请务必使用DOTALL和MULTILINE

代码示例:

public static void main(String[] args) throws Exception {
    String text = readStream(new FileInputStream("test.txt"));
    Matcher m = Pattern.compile("(/\\*\\*.*\\*/)|(//.*?$)", Pattern.MULTILINE | Pattern.DOTALL).matcher(text);
    while (m.find()) {
        String b = m.group();
        System.out.println("GROUP: "+b);
    }
}

public static String readStream(InputStream is) {
    StringBuilder sb = new StringBuilder(512);
    try {
        Reader r = new InputStreamReader(is, "UTF-8");
        int c = 0;
        while ((c = r.read()) != -1) {
            sb.append((char) c);
        }
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
    return sb.toString();
}