I have many large log files that look like this:
DATETIME ["2015-03-03 21:52"]
SERVER [{json_with_$_SERVER-Output}]
GET ["GET_JSON","AAA"]
POST ["POST_JSON","BBB","TEST1"]
DATETIME ["2015-03-03 21:53"]
SERVER [{json_with_$_SERVER-Output}]
GET ["GET_JSON","CCC"]
POST ["POST_JSON","DDD","TEST2"]
DATETIME ["2015-03-03 21:54"]
SERVER [{json_with_$_SERVER-Output}]
GET ["GET_JSON","AAA"]
POST ["POST_JSON","BBB","TEST3"]
DATETIME ["2015-03-03 21:55"]
SERVER [{json_with_$_SERVER-Output}]
GET ["GET_JSON","AAA"]
POST ["POST_JSON","EEE","TEST4"]
I want to search on two keywords that are separated by a newline: a specific word in the GET line and a specific word in the POST line.
I need something like:
grep "GET(.*)AAA(.*)POST(.*)BBB"
I am searching for AAA (in the GET line) && BBB (in the POST line).
Expected result:
POST ["POST_JSON","BBB","TEST1"]
POST ["POST_JSON","BBB","TEST3"]
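What is a simple way to do this?
Answer 0: (score: 1)
With GNU awk for the 3rd arg to match() (RS= selects paragraph mode, so this assumes the records in the files are separated by blank lines):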
$ find . -type f |
xargs gawk -v RS= 'match($0,/\nGET.*AAA.*\n(POST.*BBB.*)/,a){print a[1]}'
POST ["POST_JSON","BBB","TEST1"]
POST ["POST_JSON","BBB","TEST3"]
If you do want a blank line between the output lines, add -v ORS='\n\n'.
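The command above relies on paragraph mode (RS=), which only splits records at blank lines. If the log has no blank lines between records, a line-by-line sketch along the following lines may be easier. It assumes that each POST line belongs to the most recent GET line before it, and it uses only POSIX awk features. The sample file and the variable names log, result, and get_ok are made up for illustration.

```shell
# Hypothetical sample log with the same layout as in the question.
log=$(mktemp)
cat > "$log" <<'EOF'
DATETIME ["2015-03-03 21:52"]
SERVER [{}]
GET ["GET_JSON","AAA"]
POST ["POST_JSON","BBB","TEST1"]
DATETIME ["2015-03-03 21:53"]
SERVER [{}]
GET ["GET_JSON","CCC"]
POST ["POST_JSON","DDD","TEST2"]
DATETIME ["2015-03-03 21:54"]
SERVER [{}]
GET ["GET_JSON","AAA"]
POST ["POST_JSON","BBB","TEST3"]
EOF

result=$(awk '
  # On a GET line, remember whether it contains the first keyword.
  /^GET /  { get_ok = ($0 ~ /AAA/); next }
  # On a POST line, print it only if the preceding GET matched
  # and this line contains the second keyword; then reset the flag.
  /^POST / { if (get_ok && $0 ~ /BBB/) print; get_ok = 0 }
' "$log")

printf '%s\n' "$result"
rm -f "$log"
```

Because the flag is reset after every POST line, a match in one record cannot leak into the next one.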
Answer 1: (score: 0)
grep is the command you are looking for:
grep -rHn "GET.*KEYWORD_A" -A1 /path/to/files | grep "POST.*KEYWORD_B"
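First I grep for the lines that contain KEYWORD_A and append one line after each match (-A1), because POST comes right after GET in the log file. Then I search that output for KEYWORD_B.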
-r greps recursively in a directory
-H prints the file name
-n prints the line number
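As a rough illustration of what the pipeline prints, here is a sketch that runs it against a throwaway directory. It assumes GNU grep. The directory, the file name app.log, and the variable names are made up for the example, with AAA and BBB standing in for KEYWORD_A and KEYWORD_B.

```shell
# Hypothetical directory containing one sample log.
dir=$(mktemp -d)
cat > "$dir/app.log" <<'EOF'
DATETIME ["2015-03-03 21:52"]
SERVER [{}]
GET ["GET_JSON","AAA"]
POST ["POST_JSON","BBB","TEST1"]
DATETIME ["2015-03-03 21:53"]
SERVER [{}]
GET ["GET_JSON","CCC"]
POST ["POST_JSON","DDD","TEST2"]
DATETIME ["2015-03-03 21:54"]
SERVER [{}]
GET ["GET_JSON","AAA"]
POST ["POST_JSON","BBB","TEST3"]
EOF

# Run from inside the directory so the printed paths stay short.
# The first grep keeps each matching GET line plus the line after it;
# the second grep keeps only the POST lines that contain BBB.
result=$(cd "$dir" && grep -rHn -A1 'GET.*AAA' . | grep 'POST.*BBB')

printf '%s\n' "$result"
rm -rf "$dir"
```

With -H and -n, every surviving line carries a file name and line number prefix. GNU grep separates that prefix with "-" on context lines (the ones added by -A1) and with ":" on matching lines, so the POST lines show up in the form file-line-text.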
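Answer 2: (score: 0)
I solved this with grep -P for the regular expressions, since I know that syntax from PHP, and in particular with -A to get the next n lines. Then I piped the result through "|" into a second grep -P to filter it.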
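The answer gives no command, so the following is only a guess at what such a pipeline could look like, not the author's actual one. It assumes GNU grep built with PCRE support (-P). The sample file, the anchored patterns, and the variable names are illustrative.

```shell
# Hypothetical sample log with the same layout as in the question.
log=$(mktemp)
cat > "$log" <<'EOF'
DATETIME ["2015-03-03 21:52"]
SERVER [{}]
GET ["GET_JSON","AAA"]
POST ["POST_JSON","BBB","TEST1"]
DATETIME ["2015-03-03 21:53"]
SERVER [{}]
GET ["GET_JSON","CCC"]
POST ["POST_JSON","DDD","TEST2"]
DATETIME ["2015-03-03 21:54"]
SERVER [{}]
GET ["GET_JSON","AAA"]
POST ["POST_JSON","BBB","TEST3"]
EOF

# Step 1: select GET lines containing AAA, plus one line after each (-A1).
# Step 2: from that output, keep only POST lines containing BBB.
result=$(grep -P -A1 '^GET\b.*AAA' "$log" | grep -P '^POST\b.*BBB')

printf '%s\n' "$result"
rm -f "$log"
```

Anchoring both patterns with ^ keeps the second grep from matching the "--" group separators that -A inserts between non-adjacent matches, and it prevents a keyword on a GET line from being mistaken for one on a POST line.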