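How can I extract the source IP address and the info field with a regular expression?

Time: 2016-03-02 07:32:55

Tags: java regex

I want to extract the source IP address and the info field using a regular expression.

Here is a sample from the text file: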

"No.","Time","Source","Destination","Protocol","Length","Info","SrcPort","Dest.port","Response time","Frequency","delta"
"","2007-11-13 18:10:53.940873","127.0.0.1","127.0.0.1","HTTP","162","GET /scripts/..%25%35%63../winnt/system32/cmd.exe?/c+dir HTTP/1.0 ","43974","80","0.000000","","0.000000"
             I want to extract...    ^ this    ... and ...                     ^ this info

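The file can contain thousands of lines. I only want to extract the source IP address and the info field from each line.

The expected output is: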

127.0.0.1 GET /scripts/..%25%35%63../winnt/system32/cmd.exe?/c+dir HTTP/1.0

4 answers:

Answer 0 (score: 1)

If you want to do this purely with a regular expression:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static void main(String[] args)
{
    String s = "No.\",\"Time\",\"Source\",\"Destination\",\"Protocol\",\"Length\",\"Info\",\"SrcPort\",\"Dest.port\",\"Response time\",\"Frequency\",\"delta\",\"2007-11-13 18:10:53.940873\",\"127.0.0.1\",\"127.0.0.1\",\"HTTP\",\"162\",\"GET /scripts/..%25%35%63../winnt/system32/cmd.exe?/c+dir HTTP/1.0 \",\"43974\",\"80\",\"0.000000\",\"\",\"0.000000";
    // IP = first dotted quad, METHOD = HTTP verb, URI = request target, HTTPVERSION = trailing HTTP/x.y
    Matcher m = Pattern.compile("(?<IP>(?:\\d{1,3}\\.){3}\\d{1,3}).*?(?<METHOD>GET|POST|PUT|DELETE)\\s+(?<URI>\\S+)\\s+(?<HTTPVERSION>HTTP/\\d(?:\\.\\d)?)").matcher(s);
    if (m.find()) { // only read the groups when the pattern actually matched
        System.out.println("Result " + m.group("IP") + " " + m.group("METHOD") + " " + m.group("URI") + " " + m.group("HTTPVERSION"));
    }
}

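P.S. Named groups are available from Java 7 onward. I used them only for convenience; you can get the same result without them. In any case, I would not rely on regexes for this kind of task. As soon as you need to add another rule or condition, the regex grows very quickly. Regular expressions are not a magic wand; use them with care.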
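
To illustrate the point that named groups are optional, here is a minimal sketch with numbered groups. It assumes every data line has the layout shown in the question. The class name `NumberedGroupExtractor` and the method `extract` are made up for this example. The pattern is compiled once so it can be reused for each of the thousands of lines.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NumberedGroupExtractor {
    // Group 1: first dotted quad on the line (the Source column in the sample).
    // Group 2: request line, from the HTTP method through the HTTP version.
    private static final Pattern LINE = Pattern.compile(
        "((?:\\d{1,3}\\.){3}\\d{1,3}).*?((?:GET|POST|PUT|DELETE) .*?HTTP/\\d(?:\\.\\d)?)");

    // Returns "<ip> <request>", or null when the line does not match.
    static String extract(String line) {
        Matcher m = LINE.matcher(line);
        return m.find() ? m.group(1) + " " + m.group(2) : null;
    }

    public static void main(String[] args) {
        String line = "\"\",\"2007-11-13 18:10:53.940873\",\"127.0.0.1\",\"127.0.0.1\",\"HTTP\",\"162\","
            + "\"GET /scripts/..%25%35%63../winnt/system32/cmd.exe?/c+dir HTTP/1.0 \",\"43974\",\"80\",\"0.000000\",\"\",\"0.000000\"";
        System.out.println(extract(line));
    }
}
```

Because the first dotted quad on the line wins, this relies on Source always appearing before Destination, which holds for the header shown in the question.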

Answer 1 (score: 1)

If you can guarantee that a comma never appears inside fields 0-6, you can use the following:

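// s is one data line; the limit of 8 leaves everything after Info in fields[7]
String[] fields = s.split(",", 8);
// Column order is No., Time, Source, Destination, ..., so Source is index 2
System.out.println("source: " + fields[2].replace("\"", ""));
System.out.println("info  : " + fields[6].replace("\"", "").trim());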

If you cannot guarantee that, prefer a CSV parser over a regex solution.
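
To show what a CSV parser does differently from a plain `split`, here is a hand-rolled sketch of quote-aware splitting. It is not a replacement for a real CSV library. It assumes fields are wrapped in double quotes, that a doubled quote (`""`) inside a quoted field stands for a literal quote, and that no field spans several lines. The class name `QuotedCsvLine` and the method `parse` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class QuotedCsvLine {
    // Splits one CSV line into fields; commas inside double quotes are kept as data.
    static List<String> parse(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // doubled quote: literal quote character
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());      // last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        String line = "\"\",\"2007-11-13 18:10:53.940873\",\"127.0.0.1\",\"127.0.0.1\",\"HTTP\",\"162\","
            + "\"GET /scripts/..%25%35%63../winnt/system32/cmd.exe?/c+dir HTTP/1.0 \",\"43974\",\"80\",\"0.000000\",\"\",\"0.000000\"";
        List<String> f = parse(line);
        // Index 2 is Source and index 6 is Info, per the header row in the question.
        System.out.println(f.get(2) + " " + f.get(6).trim());
    }
}
```

With this approach a comma inside the Info field no longer shifts the column indexes, which is the case the `split` version cannot handle.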

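Answer 2 (score: 0)

This matches both IP addresses: (\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3}) And this matches the info: (GET .*?)" -> this gives you the info in the first group.

A CSV parser would be the better choice, though, as suggested in the comments.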
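
For reference, the two patterns from this answer could be applied in Java roughly as follows. This is only a sketch: the class name `TwoPatternDemo` and both method names are invented for the example, and the backslashes are doubled because the patterns sit inside Java string literals.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TwoPatternDemo {
    private static final Pattern IP =
        Pattern.compile("(\\d{1,3})\\.(\\d{1,3})\\.(\\d{1,3})\\.(\\d{1,3})");
    private static final Pattern INFO = Pattern.compile("(GET .*?)\"");

    // Collects every dotted quad on the line, in order of appearance.
    static List<String> findIps(String line) {
        List<String> ips = new ArrayList<>();
        Matcher m = IP.matcher(line);
        while (m.find()) {
            ips.add(m.group());
        }
        return ips;
    }

    // Returns the text from "GET " up to the next double quote, or null if absent.
    static String findInfo(String line) {
        Matcher m = INFO.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String line = "\"\",\"2007-11-13 18:10:53.940873\",\"127.0.0.1\",\"127.0.0.1\",\"HTTP\",\"162\","
            + "\"GET /scripts/..%25%35%63../winnt/system32/cmd.exe?/c+dir HTTP/1.0 \",\"43974\",\"80\",\"0.000000\",\"\",\"0.000000\"";
        List<String> ips = findIps(line);   // first entry is Source, second is Destination
        String info = findInfo(line);
        if (!ips.isEmpty() && info != null) {
            System.out.println(ips.get(0) + " " + info.trim());
        }
    }
}
```

Note that `\d{1,3}` also accepts out-of-range values such as 999, and the info pattern only covers GET requests, so both would need tightening for other traffic.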

Answer 3 (score: 0)

If you want plain Java code combined with a regex, you can try this example solution:

    String text = "No.,Time,Source,Destination,Protocol,Length,Info,SrcPort,Dest.port,Response time,Frequency,delta,2007-11-13 18:10:53.940873,127.0.0.1,127.0.0.1,HTTP,162,GET /scripts/..%25%35%63../winnt/system32/cmd.exe?/c+dir HTTP/1.0 ,43974,80,0.000000,,0.000000";

    String[] texts = text.split(",");
    StringBuilder output = new StringBuilder();

    boolean foundIp = false;
    for(String s : texts){
        if(s.matches("^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$") && !foundIp){
            output.append(s);
            foundIp = true;
            continue;
        }
        if(s.startsWith("GET") && s.trim().endsWith("HTTP/1.0")){
            output.append(" ").append(s.trim());
            continue;
        }
    }

    System.out.println(output.toString());

You can add further rules as you see fit, for example printing nothing when no IP address is found.

Code output:

127.0.0.1 GET /scripts/..%25%35%63../winnt/system32/cmd.exe?/c+dir HTTP/1.0
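
The extra rule mentioned above (no output when no IP address is found) could look something like this. It is a sketch under the same assumptions as the code in this answer: unquoted, comma-separated text and a request that starts with GET and ends with HTTP/1.0. The class name `SplitScanExtractor` and the method `extract` are hypothetical.

```java
import java.util.Optional;

class SplitScanExtractor {
    // Same IPv4 check as above; String.matches already anchors at both ends.
    private static final String IPV4 =
        "(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)";

    // Returns "<ip> <request>" only when both parts were found, otherwise empty.
    static Optional<String> extract(String text) {
        String ip = null;
        String info = null;
        for (String s : text.split(",")) {
            String field = s.trim();
            if (ip == null && field.matches(IPV4)) {
                ip = field;                  // keep only the first IP (the source)
            } else if (info == null && field.startsWith("GET") && field.endsWith("HTTP/1.0")) {
                info = field;
            }
        }
        return (ip != null && info != null) ? Optional.of(ip + " " + info) : Optional.empty();
    }

    public static void main(String[] args) {
        String text = "2007-11-13 18:10:53.940873,127.0.0.1,127.0.0.1,HTTP,162,"
            + "GET /scripts/..%25%35%63../winnt/system32/cmd.exe?/c+dir HTTP/1.0 ,43974,80,0.000000,,0.000000";
        // Prints only when both the IP and the request were found.
        extract(text).ifPresent(System.out::println);
    }
}
```

Returning an `Optional` leaves it to the caller to decide whether a line without a match should be skipped, logged, or treated as an error.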