Some of the entries in my log file are:
1. IP428702 - - [02/Sep/2017:18:44:27 +0200] "GET /?ln=de HTTP/1.1" 200 4858 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 122026 0 NOSSL
2. 22354 - - [01/Sep/2017:07:12:06 +0200] "GET / HTTP/1.1" 200 18359 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.1" 131909 0 NOSSL
3. IP428702 - - [02/Sep/2017:18:42:14 +0200] "GET /search?ln=en&sc=1&p=1&action_search=1 HTTP/1.1" 200 9490 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36\"'`--" 2155371 2 NOSSL
4. IP428702 - - [02/Sep/2017:18:42:43 +0200] "GET /search?ln=en&sc=1&p=&action_search= HTTP/1.1" 200 9796 "http://doc.rero.ch/search?l...\"'`--" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 5776261 5 NOSSL
5. IP173839 - - [02/Sep/2017:12:09:55 +0200] "GET /server/document/get_indexing?page_nr=16&from=&to=&url=http://doc.rero.ch/record/1... HTTP/1.1" 200 131113 "http://doc.rero.ch/client/fr//" "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.112
6. IP423766 - - [01/Sep/2017:14:30:25 +0200] "GET /record/11876/files/bulletin_vals_asla_2007_085.pdf?version=1'\" HTTP/1.1" 200 6847339 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; iebar; acc=none; SV1; snprtz|S04087544802137; .NET CLR 1.1.4322)" 241381 0 NOSSL
IP427 - - [01/Sep/2017:14:30:25 +0200] "GET /record/258826/export/xd?ln=en HTTP/1.1" 200 441 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search..." 114963 0 NOSSL
The code I use to read the log entries is:
import pandas as pd

data = pd.read_csv(
    'path_to logfile',
    sep=r'\s+(?=(?:[^"]*"[^"]*")*[^"]*$)(?![^\[]*\])',
    engine='python',
    names=["ip", "time", "request", "status", "size", "referer", "user_agent"],
    skipfooter=1,
    usecols=[0, 3, 4, 5, 6, 7, 8],
)
For some entries it returns only:
"IP423766 - - [01/Sep/2017:14:30:25 +0200] "GET "
How can I get all of the fields from every entry?
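One possible direction, offered as a sketch rather than a definitive fix: the `sep` pattern decides whether a space lies outside quotes by counting the double quotes that follow it, so an entry containing a backslash-escaped quote (`\"`) inside a quoted field appears to throw the count off. An alternative is to match each line with one regular expression whose quoted fields tolerate backslash escapes. The names `LOG_LINE`, `parse_line` and `parse_log` below are hypothetical, and the field layout (ip, two ignored fields, time, request, status, size, referer, user agent) is assumed from the sample entries.

```python
import re

# Each quoted field is written as "(?:[^"\\]|\\.)*": any run of characters
# that are neither a quote nor a backslash, or a backslash followed by any
# character. This lets an escaped quote (\") sit inside a field.
LOG_LINE = re.compile(
    r'(?P<ip>\S+)\s+\S+\s+\S+\s+'            # ip, then two ignored fields
    r'\[(?P<time>[^\]]+)\]\s+'               # [timestamp]
    r'"(?P<request>(?:[^"\\]|\\.)*)"\s+'     # "request line"
    r'(?P<status>\d{3})\s+'                  # HTTP status code
    r'(?P<size>\S+)\s+'                      # response size
    r'"(?P<referer>(?:[^"\\]|\\.)*)"\s+'     # "referer"
    r'"(?P<user_agent>(?:[^"\\]|\\.)*)"'     # "user agent"
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def parse_log(lines):
    """Parse an iterable of lines, skipping any that do not match."""
    rows = (parse_line(line) for line in lines)
    return [row for row in rows if row is not None]


# Entry 6 from the question, which has an escaped quote in the request.
sample = r'''IP423766 - - [01/Sep/2017:14:30:25 +0200] "GET /record/11876/files/bulletin_vals_asla_2007_085.pdf?version=1'\" HTTP/1.1" 200 6847339 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; iebar; acc=none; SV1; snprtz|S04087544802137; .NET CLR 1.1.4322)" 241381 0 NOSSL'''
print(parse_line(sample))
```

The resulting list of dicts can be passed to `pd.DataFrame(...)` to get the same seven columns as in the `read_csv` call. Lines that are cut off before the closing quote of the user agent, like entry 5, do not match and are skipped, so they would need separate handling if they matter.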