How to parse a log file with java.util.regex.Matcher in Java

Time: 2014-04-11 10:29:59

Tags: java regex parsing

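I am trying to get to grips with regular expressions in Java. I am working with log files and want to extract the individual log fields. For example, I have the following line:

Apr 10 21:08:55 kali sshd[37727]: Failed password for root from 127.0.0.1 port 42035 ssh2

I would like the output to be: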

"Date&Time" = Apr 10 21:08:55
"Hostname" = kali
"Program Name" = sshd
"Log" = Failed password for root from 127.0.0.1 port 42035 ssh2

Here is my Java code so far:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogRegExp {

    public static void main(String argv[]) {
        String logEntryLine = "Apr 10 21:08:55 kali sshd[37727]: Failed password for root from 127.0.0.1 port 42035 ssh2";
        String logEntryPattern = "(\\w.+) (\\d.+) (\\w.+) (\\w.+)";

        Pattern p = Pattern.compile(logEntryPattern);
        Matcher matcher = p.matcher(logEntryLine);
        if (!matcher.matches()) {
            System.err.println("Bad log entry (or problem with RE?):");
            System.err.println(logEntryLine);
            return;
        }
        System.out.println("Date&Time: " + matcher.group(1));
        System.out.println("Hostname: " + matcher.group(2));
        System.out.println("Program Name: " + matcher.group(3));
        System.out.println("Log: " + matcher.group(4));
    }
}

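I tried following this example: http://www.java2s.com/Code/Java/Development-Class/ParseanApachelogfilewithRegularExpressions.htm

But I could not adapt it to my needs. I understand how to apply escape characters, digits, and so on, but I don't know how to put them together for my case. Can anyone help me?

3 answers:

Answer 0 (score: 3)

Use this code: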

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogRegExp {

    public static void main(String argv[]) {
        String logEntryLine = "Apr 10 21:08:55 kali sshd[37727]: Failed password for root from 127.0.0.1 port 42035 ssh2";
        String logEntryPattern = "([\\w]+\\s[\\d]+\\s[\\d:]+)\\s([\\w]+)\\s([\\w]+)\\[.+\\]:\\s(.+)";

        Pattern p = Pattern.compile(logEntryPattern);
        Matcher matcher = p.matcher(logEntryLine);
        if (!matcher.matches()) {
            System.err.println("Bad log entry (or problem with RE?):");
            System.err.println(logEntryLine);
            return;
        }
        System.out.println("Date&Time: " + matcher.group(1));
        System.out.println("Hostname: " + matcher.group(2));
        System.out.println("Program Name: " + matcher.group(3));
        System.out.println("Log: " + matcher.group(4));

    }
}
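
As a variation on the pattern above, the same fields can be captured with named groups (available since Java 7), which avoids counting parentheses when the pattern changes. This is only a sketch: the class name `NamedGroupLogParser`, the `parse` helper, and the group names are made up for illustration. It also uses `\\s+` inside the timestamp on the assumption that some syslog lines pad single-digit days with an extra space.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupLogParser {

    // Same structure as the answer's pattern, but every field has a name.
    // Java group names may only contain letters and digits and must start with a letter.
    static final Pattern LOG_LINE = Pattern.compile(
            "(?<timestamp>\\w+\\s+\\d+\\s+[\\d:]+)\\s"   // e.g. "Apr 10 21:08:55"
            + "(?<host>\\w+)\\s"                         // e.g. "kali"
            + "(?<program>\\w+)\\[.+\\]:\\s"             // e.g. "sshd", PID in brackets is skipped
            + "(?<message>.+)");                         // rest of the line

    /** Returns {timestamp, host, program, message}, or null if the line does not match. */
    static String[] parse(String line) {
        Matcher m = LOG_LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] {
            m.group("timestamp"), m.group("host"), m.group("program"), m.group("message")
        };
    }

    public static void main(String[] args) {
        String line = "Apr 10 21:08:55 kali sshd[37727]: Failed password for root from 127.0.0.1 port 42035 ssh2";
        String[] fields = parse(line);
        if (fields == null) {
            System.err.println("Bad log entry (or problem with RE?):");
            System.err.println(line);
            return;
        }
        System.out.println("Date&Time: " + fields[0]);
        System.out.println("Hostname: " + fields[1]);
        System.out.println("Program Name: " + fields[2]);
        System.out.println("Log: " + fields[3]);
    }
}
```

Compiling the pattern once into a static field also avoids recompiling it for every line when a whole file is processed.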

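Answer 1 (score: 1)

You can make the following modifications to your code:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogRegExp {

    public static void main(String argv[]) {
        String logEntryLine = "Apr 10 21:08:55 kali sshd[37727]: Failed password for root from 127.0.0.1 port 42035 ssh2";
        // Note: \\w{4} and \\d{5} match exactly 4 word characters and exactly 5 digits,
        // which fits "sshd" and "37727" in this sample line but not names or PIDs of other lengths.
        String logEntryPattern = "([\\w]+\\s[\\d]+\\s[\\d:]+) (\\w+) (\\w{4})(\\[\\d{5}\\]:) (\\w.+)";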

        Pattern p = Pattern.compile(logEntryPattern);
        Matcher matcher = p.matcher(logEntryLine);
        if (!matcher.matches()) {
            System.err.println("Bad log entry (or problem with RE?):");
            System.err.println(logEntryLine);
            return;
        }
        System.out.println("Date&Time: " + matcher.group(1));
        System.out.println("Hostname: " + matcher.group(2));
        System.out.println("Program Name: " + matcher.group(3));
        // Group 4 is the "[37727]:" part, so the message is group 5.
        System.out.println("Log: " + matcher.group(5));

    }
}

The output of this program is:

Date&Time: Apr 10 21:08:55
Hostname: kali
Program Name: sshd
Log: Failed password for root from 127.0.0.1 port 42035 ssh2
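
Because the pattern above fixes the program name at four characters and the PID at five digits, it may reject other lines from the same log. The following is a hedged sketch of a more relaxed variant, not part of the original answer: the class name `RelaxedSyslogParser`, the `parse` helper, and the second sample line are assumptions made up for illustration. It accepts a program name of any length and treats the bracketed PID as optional.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RelaxedSyslogParser {

    // Groups: 1 = timestamp, 2 = host, 3 = program, 4 = PID (optional), 5 = message.
    static final Pattern LOG_LINE = Pattern.compile(
            "(\\w+\\s+\\d+\\s+[\\d:]+) "      // timestamp, e.g. "Apr 10 21:08:55"
            + "(\\S+) "                       // host: any run of non-whitespace
            + "([^\\[:\\s]+)"                 // program: stops at '[', ':' or whitespace
            + "(?:\\[(\\d+)\\])?: "           // optional "[pid]", then ": "
            + "(.+)");                        // message: rest of the line

    /** Returns {timestamp, host, program, pid, message}; pid is null when absent. Null if no match. */
    static String[] parse(String line) {
        Matcher m = LOG_LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(4), m.group(5) };
    }

    public static void main(String[] args) {
        String[] lines = {
            "Apr 10 21:08:55 kali sshd[37727]: Failed password for root from 127.0.0.1 port 42035 ssh2",
            // Hypothetical line without a PID, invented for this example.
            "Apr 10 21:09:01 kali kernel: device eth0 entered promiscuous mode"
        };
        for (String line : lines) {
            String[] f = parse(line);
            if (f == null) {
                System.err.println("Bad log entry: " + line);
                continue;
            }
            System.out.println("Date&Time: " + f[0]);
            System.out.println("Hostname: " + f[1]);
            System.out.println("Program Name: " + f[2]);
            System.out.println("PID: " + (f[3] == null ? "(none)" : f[3]));
            System.out.println("Log: " + f[4]);
        }
    }
}
```

For a real log file, the array of sample lines would be replaced by lines read from the file, with unmatched lines reported rather than aborting the whole run.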

Answer 2 (score: 0)

Try this pattern:

String logEntryPattern = "(.+\\d\\d?:\\d\\d?:\\d\\d?) (\\S+) ([^\\[]+)\\S+ (.+)";
// \\d\\d?:\\d\\d?:\\d\\d? matches the hh:mm:ss part of the timestamp
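
When adapting a pattern like this one, it can help to print every captured group and compare it with the field you expected. The helper below is a minimal sketch for that purpose; the class name `GroupDump` and the `groups` method are hypothetical and only wrap the standard `Matcher.groupCount()` and `Matcher.group(int)` calls.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDump {

    /** Returns the text of groups 1..n, or an empty array if the whole input does not match. */
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches()) {
            return new String[0];
        }
        String[] out = new String[m.groupCount()];
        for (int i = 1; i <= m.groupCount(); i++) {
            out[i - 1] = m.group(i);   // group 0 is the whole match, so start at 1
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "Apr 10 21:08:55 kali sshd[37727]: Failed password for root from 127.0.0.1 port 42035 ssh2";
        String pattern = "(.+\\d\\d?:\\d\\d?:\\d\\d?) (\\S+) ([^\\[]+)\\S+ (.+)";

        String[] captured = groups(pattern, line);
        if (captured.length == 0) {
            System.err.println("No match for: " + line);
            return;
        }
        for (int i = 0; i < captured.length; i++) {
            System.out.println("group " + (i + 1) + " = [" + captured[i] + "]");
        }
    }
}
```

The square brackets around each value make stray leading or trailing spaces visible, which is a common sign that a group boundary is in the wrong place.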