Java regular expression for nginx logs

Date: 2012-06-14 09:33:58

Tags: ruby regex nginx rlike

I want to transform my nginx logs with a regular-expression script, as follows:

Raw log:

07.21.99.178 - - [01/Jun/2012:12:06:23 +0530] "GET /api?playSessionId=live_21_bc206d95-113f-4b49-989b-7dff77af51c410.190.217.2111338532565422 HTTP/1.1" 200 71 "-" "Jakarta Commons-HttpClient/3.1"

I want the playSessionId as output, and I used the following script:

#!/usr/bin/env ruby

mon = {"Jan" => '01', "Feb" => '02', "Mar" => '03', "Apr" => '04', "May" => '05', "Jun" => '06', "Jul" => '07', "Aug" => '08', "Sep" => '09', "Oct" => '10', "Nov" => '11', "Dec" => '12'}

STDIN.each_line do |line|
  if line =~ /([\d+|\.]+) (\d+)\/(\w+)\/(\d+):(\d+):\d+:\d+ \+\d+] "GET \/api\?playSessionId=(^&*)/
    d = "#{$3}-#{mon$2}-#{$1}"
    h = $4
    pid = $5
    puts "#{d}\t#{h}\t#{pid}"
  end
end

But this doesn't seem to work :( Could someone give me the Java regular expression for this, so that I can use it in Hive?

2 Answers:

Answer 0: (score: 0)

I think you want this:

/([\d+|\.]+) (\d+)\/(\w+)\/(\d+):(\d+):\d+:\d+ \+\d+] "GET \/api\?playSessionId=([^ &]+)/
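Note that the sample line has ` - - [` between the client IP and the date, which the pattern above does not account for, and that capturing the IP as group 1 shifts the date fields to groups 2 to 5. A minimal Ruby sketch of the whole script under those observations might look like the following. The names `MONTHS`, `LOG_PATTERN` and `parse_line` are made up for illustration, and the pattern assumes every line follows the layout of the sample line in the question.

```ruby
# Month abbreviation -> two-digit month number.
MONTHS = {
  "Jan" => "01", "Feb" => "02", "Mar" => "03", "Apr" => "04",
  "May" => "05", "Jun" => "06", "Jul" => "07", "Aug" => "08",
  "Sep" => "09", "Oct" => "10", "Nov" => "11", "Dec" => "12"
}.freeze

# Assumed layout: IP, two more fields, [dd/Mon/yyyy:HH:MM:SS zone], then the request.
LOG_PATTERN = %r{
  \A\S+\s+\S+\s+\S+\s+                             # client IP and the two fields after it
  \[(\d+)/(\w+)/(\d+):(\d+):\d+:\d+\s[+-]\d+\]\s   # day, month, year, hour
  "GET\s/api\?playSessionId=([^&\s]+)              # id runs until an ampersand or whitespace
}x

# Returns "yyyy-mm-dd<TAB>hour<TAB>playSessionId", or nil if the line does not match.
def parse_line(line)
  m = LOG_PATTERN.match(line)
  return nil unless m

  day, month, year, hour, pid = m.captures
  "#{year}-#{MONTHS[month]}-#{day}\t#{hour}\t#{pid}"
end

sample = '07.21.99.178 - - [01/Jun/2012:12:06:23 +0530] "GET /api?playSessionId=live_21_bc206d95-113f-4b49-989b-7dff77af51c410.190.217.2111338532565422 HTTP/1.1" 200 71 "-" "Jakarta Commons-HttpClient/3.1"'
puts parse_line(sample)
```

To process a real log, the last two lines could be replaced with a loop such as `STDIN.each_line { |l| r = parse_line(l); puts r if r }`.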

You might prefer to use standard Unix tools (grep + sed) for this job:

grep 'playSessionId=' foo.log | sed 's/^.*playSessionId=\([^ ]*\).*$/\1/'

Answer 1: (score: 0)

I think your regex is too complicated. This will do the job:

playSessionId=(.*)\s
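One caveat: `.*` is greedy, so on the sample line the capture runs on to the last whitespace rather than stopping at the end of the id. The sketch below compares the pattern as posted with a bounded character class. The variable names are illustrative, and the bounded variant assumes the id never contains `&` or whitespace.

```ruby
line = '07.21.99.178 - - [01/Jun/2012:12:06:23 +0530] "GET /api?playSessionId=live_21_bc206d95-113f-4b49-989b-7dff77af51c410.190.217.2111338532565422 HTTP/1.1" 200 71 "-" "Jakarta Commons-HttpClient/3.1"'

# Pattern as posted: the greedy .* backtracks only as far as the LAST whitespace.
greedy = line[/playSessionId=(.*)\s/, 1]

# Bounded variant: stop at the first '&' or whitespace character.
bounded = line[/playSessionId=([^&\s]+)/, 1]

puts "greedy:  #{greedy}"
puts "bounded: #{bounded}"
```

The bounded pattern uses only syntax that Java's `java.util.regex` also accepts, so it could presumably be passed to Hive's `regexp_extract` with capture group index 1. That usage is an assumption here and has not been tested.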