I want to use a regular expression to extract the source IP address and the info.
Here is a sample from the text file:
"No.","Time","Source","Destination","Protocol","Length","Info","SrcPort","Dest.port","Response time","Frequency","delta"
"","2007-11-13 18:10:53.940873","127.0.0.1","127.0.0.1","HTTP","162","GET /scripts/..%25%35%63../winnt/system32/cmd.exe?/c+dir HTTP/1.0 ","43974","80","0.000000","","0.000000"
I want to extract... ^ this ... and ... ^ this info
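The file can contain thousands of lines. I just want to extract the source IP address and the info from each line.
The expected output is: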
127.0.0.1 GET /scripts/..%25%35%63../winnt/system32/cmd.exe?/c+dir HTTP/1.0
Answer 0 (score: 1)
If you want to do this purely with a regular expression:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static void main(String[] args)
{
    String s = "No.\",\"Time\",\"Source\",\"Destination\",\"Protocol\",\"Length\",\"Info\",\"SrcPort\",\"Dest.port\",\"Response time\",\"Frequency\",\"delta\",\"2007-11-13 18:10:53.940873\",\"127.0.0.1\",\"127.0.0.1\",\"HTTP\",\"162\",\"GET /scripts/..%25%35%63../winnt/system32/cmd.exe?/c+dir HTTP/1.0 \",\"43974\",\"80\",\"0.000000\",\"\",\"0.000000";
    // IP is the first dotted quad; METHOD, URI and HTTPVERSION are the three parts of the request line.
    Matcher m = Pattern.compile("(?<IP>\\d{1,3}(?:\\.\\d{1,3}){3}).*?(?<METHOD>GET|POST|PUT|DELETE)\\s+(?<URI>\\S+)\\s+(?<HTTPVERSION>HTTP/\\d(?:\\.\\d)?)").matcher(s);
    // Only read the groups when a match was actually found.
    if (m.find()) {
        System.out.println("Result " + m.group("IP") + " " + m.group("METHOD") + " " + m.group("URI") + " " + m.group("HTTPVERSION"));
    }
}
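P.S. Named groups are available from Java 7 onward. I only use them for convenience; the same result can be obtained without them. In any case, I would not rely on regexes for this kind of task. As soon as you want to add another rule or condition, the regex grows very quickly. Regexes are not a magic wand; use them with care.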
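As a rough illustration of the remark that named groups are optional, here is a minimal sketch that uses numbered groups and applies the pattern to every line. The class name `NumberedGroupsSketch` and the method `extract` are made up for this example, and the pattern assumes the first dotted quad on a line is the source address, which holds for the column order shown in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
final class NumberedGroupsSketch {
    // Group 1: first dotted quad on the line (assumed to be the source IP).
    // Group 2: method, URI and HTTP version of the request.
    private static final Pattern LINE = Pattern.compile(
            "(\\d{1,3}(?:\\.\\d{1,3}){3}).*?"
            + "((?:GET|POST|PUT|DELETE)\\s+\\S+\\s+HTTP/\\d(?:\\.\\d)?)");

    // Returns "ip request" for every line that contains both parts.
    static List<String> extract(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            Matcher m = LINE.matcher(line);
            if (m.find()) {
                out.add(m.group(1) + " " + m.group(2));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "\"No.\",\"Time\",\"Source\",\"Destination\",\"Protocol\",\"Length\",\"Info\"",
                "\"\",\"2007-11-13 18:10:53.940873\",\"127.0.0.1\",\"127.0.0.1\",\"HTTP\",\"162\","
                + "\"GET /scripts/..%25%35%63../winnt/system32/cmd.exe?/c+dir HTTP/1.0 \",\"43974\",\"80\"");
        // The header line has no match and is skipped.
        for (String result : extract(lines)) {
            System.out.println(result);
        }
    }
}
```

For a real file you would feed the lines from something like `Files.readAllLines` instead of a hard-coded list.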
Answer 1 (score: 1)
If you can guarantee that a comma never appears inside fields 0-6, you can use the following:
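String[] fields = s.split(",", 8);
// Source is the third column (index 2); index 3 would be Destination.
System.out.println("source: " + fields[2].replace("\"", ""));
System.out.println("info : " + fields[6].replace("\"", "").trim());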
If you cannot guarantee that, prefer a CSV parser over a regex solution.
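To show what a CSV parser does differently from a plain `split`, here is a minimal hand-rolled sketch that honours double-quoted fields, so a comma inside the Info column does not break the result. `QuotedCsvSketch` and `parseLine` are hypothetical names, and the sketch assumes each record fits on one line; an established CSV library is still the safer choice for real data.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: a simplified, single-line CSV field splitter.
final class QuotedCsvSketch {
    // Splits one line on commas that are outside double quotes.
    // A doubled quote ("") inside a quoted field becomes one literal quote.
    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        cur.append('"'); // escaped quote inside the field
                        i++;
                    } else {
                        inQuotes = false; // closing quote
                    }
                } else {
                    cur.append(c);
                }
            } else if (c == '"') {
                inQuotes = true; // opening quote
            } else if (c == ',') {
                fields.add(cur.toString()); // field separator
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        fields.add(cur.toString()); // last field
        return fields;
    }

    public static void main(String[] args) {
        String row = "\"\",\"2007-11-13 18:10:53.940873\",\"127.0.0.1\",\"127.0.0.1\",\"HTTP\",\"162\","
                + "\"GET /scripts/..%25%35%63../winnt/system32/cmd.exe?/c+dir HTTP/1.0 \",\"43974\",\"80\"";
        List<String> f = parseLine(row);
        // Index 2 is Source and index 6 is Info, per the header row in the question.
        System.out.println(f.get(2) + " " + f.get(6).trim());
    }
}
```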
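Answer 2 (score: 0)
This matches both IP addresses: (\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})
This matches the info: (GET .*?)" -> the info is then available in the first group.
It is better to use a CSV parser, though, as suggested in the comments.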
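For completeness, the two patterns from this answer could be wired up in Java roughly as follows. This is only a sketch: `TwoPatternSketch` and `extract` are invented names, and it relies on the assumption that the Source column comes before Destination, so the first IP match on a line is the source.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that applies the two patterns independently.
final class TwoPatternSketch {
    private static final Pattern IP =
            Pattern.compile("(\\d{1,3})\\.(\\d{1,3})\\.(\\d{1,3})\\.(\\d{1,3})");
    // Captures everything from "GET " up to the closing quote of the field.
    private static final Pattern INFO = Pattern.compile("(GET .*?)\"");

    // Returns "ip info", or null when either part is missing.
    static String extract(String line) {
        Matcher ip = IP.matcher(line);
        Matcher info = INFO.matcher(line);
        if (!ip.find() || !info.find()) {
            return null;
        }
        // ip.group() is the whole first match, assumed to be the source address.
        return ip.group() + " " + info.group(1).trim();
    }

    public static void main(String[] args) {
        String row = "\"\",\"2007-11-13 18:10:53.940873\",\"127.0.0.1\",\"127.0.0.1\",\"HTTP\",\"162\","
                + "\"GET /scripts/..%25%35%63../winnt/system32/cmd.exe?/c+dir HTTP/1.0 \",\"43974\",\"80\"";
        System.out.println(extract(row));
    }
}
```

Note that the info pattern only covers GET requests and stops at the first double quote, so it depends on the fields being quoted as in the sample.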
Answer 3 (score: 0)
If you want plain Java code combined with a regex, you can try this example solution:
String text = "No.,Time,Source,Destination,Protocol,Length,Info,SrcPort,Dest.port,Response time,Frequency,delta,2007-11-13 18:10:53.940873,127.0.0.1,127.0.0.1,HTTP,162,GET /scripts/..%25%35%63../winnt/system32/cmd.exe?/c+dir HTTP/1.0 ,43974,80,0.000000,,0.000000";
String[] texts = text.split(",");
StringBuilder output = new StringBuilder();
boolean foundIp = false;
for (String s : texts) {
    if (s.matches("^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$") && !foundIp) {
        output.append(s);
        foundIp = true;
        continue;
    }
    if (s.startsWith("GET") && s.trim().endsWith("HTTP/1.0")) {
        output.append(" ").append(s.trim());
        continue;
    }
}
System.out.println(output.toString());
You can add further rules, such as printing nothing when no IP address is found, or whatever else you need.
Code output:
127.0.0.1 GET /scripts/..%25%35%63../winnt/system32/cmd.exe?/c+dir HTTP/1.0
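The extra rule mentioned above (print nothing when no IP address is found) could be sketched like this. `GuardedExtractSketch` and `extract` are hypothetical names; the sketch keeps this answer's assumptions that the fields are unquoted, contain no embedded commas, and that only `GET ... HTTP/1.0` requests are of interest.

```java
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical helper: same split-based idea, but with an explicit "not found" result.
final class GuardedExtractSketch {
    private static final Pattern IPV4 = Pattern.compile(
            "(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}"
            + "(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)");

    // Returns "ip request" only when both an IP and a GET request are present.
    static Optional<String> extract(String line) {
        String ip = null;
        String request = null;
        for (String field : line.split(",")) {
            String f = field.trim();
            if (ip == null && IPV4.matcher(f).matches()) {
                ip = f; // first IP wins, i.e. the Source column
            } else if (request == null && f.startsWith("GET") && f.endsWith("HTTP/1.0")) {
                request = f;
            }
        }
        if (ip == null || request == null) {
            return Optional.empty();
        }
        return Optional.of(ip + " " + request);
    }

    public static void main(String[] args) {
        String text = "2007-11-13 18:10:53.940873,127.0.0.1,127.0.0.1,HTTP,162,"
                + "GET /scripts/..%25%35%63../winnt/system32/cmd.exe?/c+dir HTTP/1.0 ,43974,80";
        // Prints only when both parts were found; otherwise stays silent.
        extract(text).ifPresent(System.out::println);
        extract("no,ip,here").ifPresent(System.out::println);
    }
}
```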