Optimizing multiple Pattern/Matcher uses in Java

Date: 2018-01-29 16:01:00

Tags: java regex pattern-matching matcher

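I have to write a program that uses multiple regular expressions. The program is below, but for every new element I have to initialize a new Pattern and Matcher. Is there an optimization, or a loop, that I could use here instead? Currently the program reads the file and loads all of the XML tags into one string. The file contains multiple tags, which I have to extract and print in CSV format, and the file holds a large amount of data, on the order of gigabytes. Is there any way to optimize the code below?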

Regex program:

package javaapplication1;

/**
*
* @author ????
*/
import java.io.*;
import java.util.ArrayList;
import java.util.regex.*;

public class JavaApplication1 {

    /**
    * @param args the command line arguments
    */

    public static void main(String[] args) throws Exception {
        // TODO code application logic here
        String oldContent = "";
        BufferedReader reader = null;
        FileWriter writer = null;
        File fileToBeModified = new File("C:\\Documents\\audit.log");
        String str = "";
        try {
            BufferedReader in = new BufferedReader(new FileReader(fileToBeModified));
            StringBuffer output = new StringBuffer();
            String st;
            while ((st=in.readLine()) != null) {
                output.append(st);
                output.append('\n');
            }
            str = output.toString();
            in.close();
        }
        catch (Exception fx) {
            fx.printStackTrace(); // report read failures instead of silently ignoring them
        }
        String usr = "";
        String origin = "";
        String dt = "";
        String operation = "";
        Pattern p = Pattern.compile("<AuditEntry>(.*\\R)*?<\\/AuditEntry>");
        Matcher m = p.matcher(str);
        while (m.find()) {
            Pattern e = Pattern.compile("<User>(.*)</User>");
            Matcher f = e.matcher(m.group(0));
            while(f.find())
            {
                usr = f.group(1);
            }
            Pattern g = Pattern.compile("<Origin>(.*)</Origin>");
            Matcher h = g.matcher(m.group(0));
            while(h.find())
            {
                origin = h.group(1);
            }
            Pattern i = Pattern.compile("<DateTime.*\">(.*)</DateTime>");
            Matcher j = i.matcher(m.group(0));
            while(j.find())
            {
                dt = j.group(1);
            }
            Pattern k = Pattern.compile("Operation=\"([a-zA-Z]+)\"");
            Matcher l = k.matcher(m.group(0));
            while(l.find())
            {
                operation = l.group(1);
            }
            System.out.println(usr+","+origin+"," +dt+ ","+operation);
        }
    }
}
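
Because the log can reach gigabytes, building one string from the whole file before matching may exhaust memory. One possible alternative, shown here only as a minimal sketch, is to read line by line and hand each entry to a callback as soon as its closing tag is seen. It assumes that `<AuditEntry>` and `</AuditEntry>` never share a line with a neighbouring entry; the class name `AuditEntryReader`, the method `forEachEntry`, and the sample data are hypothetical and not part of the original program.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.function.Consumer;

class AuditEntryReader {

    // Passes the text of each <AuditEntry>...</AuditEntry> block to the handler.
    // Only the entry currently being read is buffered in memory.
    // Assumption: the tags of different entries never appear on the same line.
    public static void forEachEntry(BufferedReader reader, Consumer<String> handler)
            throws IOException {
        StringBuilder entry = null; // non-null while inside an entry
        String line;
        while ((line = reader.readLine()) != null) {
            if (entry == null && line.contains("<AuditEntry>")) {
                entry = new StringBuilder();
            }
            if (entry != null) {
                entry.append(line).append('\n');
                if (line.contains("</AuditEntry>")) {
                    handler.accept(entry.toString());
                    entry = null;
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample data standing in for the real audit.log.
        String sample = "<AuditEntry>\n<User>alice</User>\n</AuditEntry>\n"
                + "<AuditEntry>\n<User>bob</User>\n</AuditEntry>\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            forEachEntry(reader, entry -> System.out.print(entry));
        }
    }
}
```

For the real file, the `StringReader` would be replaced by a `FileReader` (or `Files.newBufferedReader`) on the log path, and the handler would run the per-field patterns on each entry.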

1 answer:

Answer 0 (score: 0)

The hash map mentioned in the comments sounds suitable but is not very readable. I prefer the "method" approach:

private String getMatch(final String pattern, final String txt) {
    final Pattern p = Pattern.compile(pattern);
    final Matcher m = p.matcher(txt);
    String result = null;
    while (m.find()) {
        result = m.group(1);
    }
    return result;
}

public void xxx() {
    //...
    while (m.find()) {
        final String txt = m.group(0);
        origin = getMatch("<Origin>(.*)</Origin>", txt);
        usr = getMatch("<User>(.*)</User>", txt);
        // ...
    }
}
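
Note that a helper taking the pattern as a `String` still calls `Pattern.compile` once per field per entry. A minimal sketch of the map-based variant, assuming the same four fields as in the question, compiles each pattern once and loops over them. The class name `AuditFieldExtractor`, the method names, the map keys, and the sample entry are hypothetical. Unlike the question's code, a field with no match yields an empty string rather than the previous entry's value.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AuditFieldExtractor {

    // Field name -> pattern, each compiled exactly once. LinkedHashMap keeps
    // insertion order, which also fixes the CSV column order.
    private static final Map<String, Pattern> FIELDS = new LinkedHashMap<>();
    static {
        FIELDS.put("user", Pattern.compile("<User>(.*)</User>"));
        FIELDS.put("origin", Pattern.compile("<Origin>(.*)</Origin>"));
        FIELDS.put("dateTime", Pattern.compile("<DateTime.*\">(.*)</DateTime>"));
        FIELDS.put("operation", Pattern.compile("Operation=\"([a-zA-Z]+)\""));
    }

    // Returns group 1 of the last match, or an empty string if nothing matches.
    public static String lastMatch(Pattern pattern, String text) {
        Matcher m = pattern.matcher(text);
        String result = "";
        while (m.find()) {
            result = m.group(1);
        }
        return result;
    }

    // Builds one CSV row from the text of a single <AuditEntry> block.
    public static String toCsvRow(String entry) {
        List<String> values = new ArrayList<>();
        for (Pattern pattern : FIELDS.values()) {
            values.add(lastMatch(pattern, entry));
        }
        return String.join(",", values);
    }

    public static void main(String[] args) {
        // Hypothetical entry; the real layout of audit.log is not shown in the question.
        String entry = "<AuditEntry>\n"
                + "<User>alice</User>\n"
                + "<Origin>10.0.0.1</Origin>\n"
                + "<DateTime zone=\"UTC\">2018-01-29T16:01:00</DateTime>\n"
                + "<Action Operation=\"Login\"/>\n"
                + "</AuditEntry>";
        System.out.println(toCsvRow(entry)); // alice,10.0.0.1,2018-01-29T16:01:00,Login
    }
}
```

Inside the outer `while (m.find())` loop, the four per-field blocks would then collapse to a single `System.out.println(toCsvRow(m.group(0)));`.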