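I need to parse this input file and can't figure out how to do it. I tried using Scanner.useDelimiter(),
but I still run into problems. Does anyone know how to do this correctly?
Here is one line from the input file: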
200.88.223.98 - - [01/Feb/2007:04:02:22 -0500] "GET /gallery/v/events/album02/contests/programmingContest05/?g2_GALLERYSID=3be9666f9c07e16b7f33e2ea8acb8dd2&g2_fromNavId=x332be852 HTTP/1.1" 200 52464 "http://cs.tcnj.edu/gallery/main.php?g2_view=comment.AddComment&g2_itemId=664&g2_return=http%3A%2F%2Fcs.tcnj.edu%2Fgallery%2Fv%2Fevents%2Falbum02%2Fcontests%2FprogrammingContest05%2F%3Fg2_GALLERYSID%3D3be9666f9c07e16b7f33e2ea8acb8dd2&g2_GALLERYSID=3be9666f9c07e16b7f33e2ea8acb8dd2&g2_returnName=album" "Opera/6.01 (Windows 98; U) [en]"
It is supposed to be split into these parts:
address = 200.88.223.98
date = 01/Feb/2007:04:02:22 -0500
request = GET /gallery/v/events/album02/contests/programmingContest05/?g2_GALLERYSID=3be9666f9c07e16b7f33e2ea8acb8dd2&g2_fromNavId=x332be852 HTTP/1.1
status = 200
bytes = 52464
refer = http://cs.tcnj.edu/gallery/main.php?g2_view=comment.AddComment&g2_itemId=664&g2_return=http%3A%2F%2Fcs.tcnj.edu%2Fgallery%2Fv%2Fevents%2Falbum02%2Fcontests%2FprogrammingContest05%2F%3Fg2_GALLERYSID%3D3be9666f9c07e16b7f33e2ea8acb8dd2&g2_GALLERYSID=3be9666f9c07e16b7f33e2ea8acb8dd2&g2_returnName=album
agent = Opera/6.01 (Windows 98; U) [en]
Here is the part of my code that tries to parse it:
Scanner scan = new Scanner(input);
scan.useDelimiter("[-']+");
while (scan.hasNextLine())
{
String address = scan.next();
String date = scan.next();
String request = scan.next();
int status = scan.nextInt();
int bytes = scan.nextInt();
String refer = scan.next();
String agent = scan.next();
}
It fails with the following error:
Exception in thread "main" java.util.InputMismatchException
at java.util.Scanner.throwFor(Scanner.java:840)
at java.util.Scanner.next(Scanner.java:1461)
at java.util.Scanner.nextInt(Scanner.java:2091)
at java.util.Scanner.nextInt(Scanner.java:2050)
at Analyzer.start(Unknown Source)
at Driver.main(Unknown Source)
Java Result: 1
Answer 0 (score: 0)
Just a thought: split the line on spaces and extract the data from the resulting tokens.
String s = "200.88.223.98 - - [01/Feb/2007:04:02:22 -0500] \"GET /gallery/v/events/album02/contests/programmingContest05/?g2_GALLERYSID=3be9666f9c07e16b7f33e2ea8acb8dd2&g2_fromNavId=x332be852 HTTP/1.1\" 200 52464 \"http://cs.tcnj.edu/gallery/main.php?g2_view=comment.AddComment&g2_itemId=664&g2_return=http%3A%2F%2Fcs.tcnj.edu%2Fgallery%2Fv%2Fevents%2Falbum02%2Fcontests%2FprogrammingContest05%2F%3Fg2_GALLERYSID%3D3be9666f9c07e16b7f33e2ea8acb8dd2&g2_GALLERYSID=3be9666f9c07e16b7f33e2ea8acb8dd2&g2_returnName=album\" \"Opera/6.01 (Windows 98; U) [en]\"";
String[] arr = s.split(" ");
for (int i = 0; i < arr.length; i++) {
    // Print each token with its index so the position of every field is visible.
    System.out.println(i + " : " + arr[i]);
}
The output is:
0 : 200.88.223.98
1 : -
2 : -
3 : [01/Feb/2007:04:02:22
4 : -0500]
5 : "GET
6 : /gallery/v/events/album02/contests/programmingContest05/?g2_GALLERYSID=3be9666f9c07e16b7f33e2ea8acb8dd2&g2_fromNavId=x332be852
7 : HTTP/1.1"
8 : 200
9 : 52464
10 : "http://cs.tcnj.edu/gallery/main.php?g2_view=comment.AddComment&g2_itemId=664&g2_return=http%3A%2F%2Fcs.tcnj.edu%2Fgallery%2Fv%2Fevents%2Falbum02%2Fcontests%2FprogrammingContest05%2F%3Fg2_GALLERYSID%3D3be9666f9c07e16b7f33e2ea8acb8dd2&g2_GALLERYSID=3be9666f9c07e16b7f33e2ea8acb8dd2&g2_returnName=album"
11 : "Opera/6.01
12 : (Windows
13 : 98;
14 : U)
15 : [en]"
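So element 0 gives you the IP address, elements 3 and 4 give you the date, and elements 5 through 7 give you the request. Elements 8 and 9 are the status and byte count, element 10 is the referrer, and elements 11 onward make up the user agent. The date still carries its square brackets and the request, referrer and agent still carry their double quotes, so strip those when you extract your data.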
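As a rough sketch of that extraction step, the tokens can be joined back into the seven fields the question asks for. The class name LogLineSplitter, the helper trimEnds, the map keys and the shortened referrer in SAMPLE are all made up for illustration. It also assumes that neither the request path nor the referrer contains a space, and that every line has the same layout as the sample.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; the class, method and key names are illustrative only.
class LogLineSplitter {
    // Sample line from the question, with the referrer shortened for readability.
    static final String SAMPLE = "200.88.223.98 - - [01/Feb/2007:04:02:22 -0500] "
            + "\"GET /gallery/v/events/album02/contests/programmingContest05/"
            + "?g2_GALLERYSID=3be9666f9c07e16b7f33e2ea8acb8dd2&g2_fromNavId=x332be852 HTTP/1.1\" "
            + "200 52464 \"http://cs.tcnj.edu/gallery/main.php\" "
            + "\"Opera/6.01 (Windows 98; U) [en]\"";

    // Splits one log line on spaces and reassembles the multi-token fields.
    static Map<String, String> parse(String line) {
        String[] arr = line.split(" ");
        if (arr.length < 12) {
            throw new IllegalArgumentException("Unexpected line layout: " + line);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("address", arr[0]);
        // Tokens 3 and 4 form the date; trimEnds drops the surrounding [ and ].
        fields.put("date", trimEnds(arr[3] + " " + arr[4]));
        // Tokens 5 to 7 form the request; trimEnds drops the surrounding quotes.
        fields.put("request", trimEnds(String.join(" ", arr[5], arr[6], arr[7])));
        fields.put("status", arr[8]);
        fields.put("bytes", arr[9]);
        fields.put("refer", trimEnds(arr[10]));
        // Everything from token 11 to the end is the quoted user agent.
        String agent = String.join(" ", Arrays.copyOfRange(arr, 11, arr.length));
        fields.put("agent", trimEnds(agent));
        return fields;
    }

    // Removes the first and last character (a bracket or a double quote).
    static String trimEnds(String s) {
        return s.substring(1, s.length() - 1);
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : parse(SAMPLE).entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

Status and bytes are kept as strings here; Integer.parseInt can convert them afterwards if the values are known to be numeric. If the request or referrer can contain spaces, the fixed token positions no longer hold and a different approach would be needed.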