For the given input example:
70.80.110.200 - - [12/Apr/2011:05:47:34 +0000] "GET /notify/click?r=http://www.xxxxxx.com/hello_world&rt=1302587231462&iid=00000 HTTP/1.1" 302 0 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; FunWebProducts; HotbarSearchToolbar 1.1; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET4.0C; AskTbFWV5/5.11.3.15590)" 4 4
I want to define the following parsing logic (possibly a regular expression).
Answer 0 (score: 3)
Try:
/^([0-9.]+).*?\[(\d+\/\w+\/\d+):(\d+:\d+:\d+).*?\].*?(\/[^ ]*).*$/
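As you would expect, groups 1 to 4 hold the requested data: group 1 is the IP address, group 2 the date, group 3 the time, and group 4 the URI. For example, .group(3) is the time.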
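Answer 1 (score: 2)
Make sure Jetty is configured to do NCSA-compatible logging; you can then use any NCSA log analyzer to analyze the logs.
If you want to do it by hand, this is a good use case for regular expressions.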
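As a rough illustration of the by-hand route, here is a minimal sketch of one regular expression for a whole NCSA-style line, using Java named groups. The class name `NcsaLogSketch`, the group names, and the shortened sample line are all made up for this example. It assumes the usual layout (host, ident, user, bracketed timestamp, quoted request, status, bytes, then an optional quoted referer and user agent) and does not handle quotes embedded inside a quoted field.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the class name and field names are illustrative only.
class NcsaLogSketch {

    // host ident user [timestamp] "request" status bytes, optionally followed by
    // "referer" "user-agent". Anything after that (e.g. the trailing "4 4") is ignored.
    private static final Pattern LINE = Pattern.compile(
            "^(?<host>\\S+) (?<ident>\\S+) (?<user>\\S+) "
            + "\\[(?<timestamp>[^\\]]+)\\] "
            + "\"(?<request>[^\"]*)\" "
            + "(?<status>\\d{3}) (?<bytes>\\d+|-)"
            + "(?: \"(?<referer>[^\"]*)\" \"(?<agent>[^\"]*)\")?"
            + ".*$");

    private static final String[] NAMES = {
        "host", "ident", "user", "timestamp", "request",
        "status", "bytes", "referer", "agent"
    };

    /** Returns the captured fields in order, or null if the line does not match. */
    static Map<String, String> parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (String name : NAMES) {
            fields.put(name, m.group(name)); // null when the optional part is absent
        }
        return fields;
    }

    public static void main(String[] args) {
        // Shortened version of the sample line from the question.
        String line = "70.80.110.200 - - [12/Apr/2011:05:47:34 +0000] "
                + "\"GET /notify/click?iid=00000 HTTP/1.1\" 302 0 \"-\" \"Mozilla/4.0\" 4 4";
        System.out.println(parse(line));
    }
}
```

Named groups keep the call sites readable (`m.group("status")` rather than `m.group(6)`), at the cost of requiring Java 7 or later.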
Answer 2 (score: 1)
A complete code example (based on hsz's answer):
import java.util.*;
import java.util.regex.*;

public class RegexDemo {
    public static void main( String[] argv ) {
        // Groups: 1 = IP, 2 = date, 3 = time, 4 = URI
        String pat = "^([0-9.]*).*?\\[(\\d+\\/\\w+\\/\\d+):(\\d+:\\d+:\\d+).*?\\].*?(\\/[^ ]*).*$";
        Pattern p = Pattern.compile(pat);
        String target = "70.80.110.200 - - [12/Apr/2011:05:47:34 +0000] \"GET /notify/click?r=http://www.xxxxxx.com/hello_world&rt=1302587231462&iid=00000 HTTP/1.1\" 302 0 \"-\" \"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; FunWebProducts; HotbarSearchToolbar 1.1; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET4.0C; AskTbFWV5/5.11.3.15590)\" 4 4";
        Matcher m = p.matcher(target);
        System.out.println("pattern: " + pat);
        System.out.println("target: " + target);
        if (m.matches()) {
            System.out.println("found");
            // group(0) is the whole match; groups 1..groupCount() are the captures
            for (int i=0; i <= m.groupCount(); ++i) {
                System.out.println(m.group(i));
            }
        }
    }
}
Answer 3 (score: 0)
You can try the following:
String s = "70.80.110.200 - - [12/Apr/2011:05:47:34 +0000] \"GET /notify/click?r=http://www.xxxxxx.com/hello_world&rt=1302587231462&iid=00000 HTTP/1.1\" 302 0 \"-\" \"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; FunWebProducts; HotbarSearchToolbar 1.1; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET4.0C; AskTbFWV5/5.11.3.15590)\" 4 4";
// Requires java.util.regex.Pattern and java.util.regex.Matcher.
Pattern p = Pattern.compile(
        "^(\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}).*?" + // ip
        "\\[([^:]*):" +                                      // date
        "(\\d{2}:\\d{2}:\\d{2}).*?\\].*?" +                  // time
        "(/[^\\s]*).*$");                                    // uri
Matcher m = p.matcher(s);
if (m.find()) {
    String ip = m.group(1);   // 70.80.110.200
    String date = m.group(2); // 12/Apr/2011
    String time = m.group(3); // 05:47:34
    String uri = m.group(4);  // /notify/click?r=...
}
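If you need the date and time as a real timestamp rather than as strings, one possible follow-up is to hand them to java.time. This is only a sketch: it assumes Java 8 or later, the class name `LogTimestampSketch` is made up, and the UTC offset (`+0000` in the sample) is passed in separately because the patterns above do not capture it.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper; not part of any of the answers above.
class LogTimestampSketch {

    // Matches timestamps such as 12/Apr/2011:05:47:34 +0000.
    // Locale.ENGLISH is needed so that "Apr" parses regardless of the default locale.
    private static final DateTimeFormatter ACCESS_LOG_TIME =
            DateTimeFormatter.ofPattern("dd/MMM/yyyy:HH:mm:ss Z", Locale.ENGLISH);

    /** Joins the captured date and time with an offset and parses the result. */
    static OffsetDateTime parse(String date, String time, String offset) {
        return OffsetDateTime.parse(date + ":" + time + " " + offset, ACCESS_LOG_TIME);
    }

    public static void main(String[] args) {
        System.out.println(parse("12/Apr/2011", "05:47:34", "+0000"));
    }
}
```

A malformed timestamp makes `OffsetDateTime.parse` throw `DateTimeParseException`, so callers that read untrusted logs would probably want to catch it.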