Matching data with a gawk command

Time: 2012-08-08 00:20:29

Tags: unix sed command gawk

Sample line from the raw log file:

  

"GET /dynamic_preroll_playlist.fmil?domain=13nwuc&width=480&height=360&imu=medrect&pubchannel=filmannex&ad_unit=category_2&sdk_ver=2.4.1.3&embeddedIn=http%3A%2F%2Fwww.filmannex.com%2Fmovie%2Fend-of-the-tunnel%2F20872&sdk_url=http%3A%2F%2Fstatic2.filmannex.com%2Fflash%2F&viewport=10,261,971,0,981,10,10,261 HTTP/1.1",200,201,1516,16363, "http://static2.filmannex.com/flash/yume_ad_library.swf" pl.networks.com,"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; FunWebProducts; GTB7.3; SLCC1; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30618; FunWebProducts; .NET4.0C)", "24_100_150_188_jZKFKQQjdRNM6e" "0rO0ABXd8AAAACgAAASQAAAaLAAAGiwAAASgAAAaLAAAGiwAAAVoAAAaLAAAGiwAAAVkAAAaKAAAGiwAAAdwAAAaKAAAGiwAAAhIAAAaKAAAGiwAAAhUAAAaKAAAGiwAAAhYAAAaKAAAGiwAAAhsAAAaKAAAGiwAAAiwAAAaKAAAGiw**", "-","-","@YD_1;233_2739",-,"-","24.100.150.188","199.127.205.6"

The desired output is the 5th and 6th fields of viewport:

981 10

I have the gawk code below, which produces the 3rd and 4th fields instead:

971 0

gawk 'match($0, /&viewport=[0-9]+,[0-9]+,([0-9]+),([0-9]+)/, m){print m[1], m[2]}' filename

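Can anyone help me with this? Is there a small change to the gawk command that would return the 5th and 6th viewport parameters?

Any ideas? Thanks a lot in advance :))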

2 answers:

Answer 0: (score: 1)

This command will do what you want:

awk '{split($0,a,"viewport=");split(a[2],b,",");print b[5],b[6]}' filename

which gives:

981 10

If you really need a modified version of your gawk command, then

gawk 'match($0, /&viewport=[0-9]+,[0-9]+,([0-9]+),([0-9]+),([0-9]+),([0-9]+),([0-9]+)/, m){print m[3], m[4]}' filename

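will also work. The two leading [0-9]+ terms skip fields 1 and 2, so m[1] and m[2] hold fields 3 and 4, and m[3] and m[4] hold fields 5 and 6.

I think the first solution is cleaner and also easier to modify.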
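To show what "easier to modify" could look like, here is a minimal sketch of the split approach with the field positions passed in as parameters. The helper name viewport_fields and the shortened sample line are my own assumptions for illustration, not part of the original command. It uses only POSIX awk features (two-argument match, substr, split), so it should not depend on gawk.

```shell
# Hypothetical sample line, shortened from the log entry in the question.
line='GET /dynamic_preroll_playlist.fmil?domain=13nwuc&viewport=10,261,971,0,981,10,10,261 HTTP/1.1",200,201'

# viewport_fields is an illustrative helper name, not an awk built-in.
# Usage: viewport_fields FIRST SECOND < logfile
viewport_fields() {
    awk -v i="$1" -v j="$2" '
        # Two-argument match() sets RSTART and RLENGTH on success.
        match($0, /viewport=[0-9,]+/) {
            # Skip the 9-character "viewport=" prefix, keep the number list.
            list = substr($0, RSTART + 9, RLENGTH - 9)
            split(list, b, ",")
            print b[i], b[j]
        }'
}

result=$(printf '%s\n' "$line" | viewport_fields 5 6)
echo "$result"
```

Asking for a different pair is then just a matter of changing the two arguments, for example viewport_fields 7 8 < filename. Lines without a viewport parameter print nothing.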

Answer 1: (score: 0)

This might work for you (GNU sed):

sed 's/.*&viewport=\(\([^,]*\),\([^,]*\),\)\{3\}.*/\2 \3/' file
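The sed command works because the outer group matches one "field,field," pair and is repeated three times, so the backreferences \2 and \3 end up holding the fields from the last repetition. A small sketch of that behaviour follows; the shortened sample line and the variable names are assumptions made for illustration.

```shell
# Hypothetical sample line, shortened from the log entry in the question.
line='GET /dynamic_preroll_playlist.fmil?domain=13nwuc&viewport=10,261,971,0,981,10,10,261 HTTP/1.1",200,201'

# Three repetitions of "field,field,": \2 and \3 hold the third pair.
third_pair=$(printf '%s\n' "$line" |
    sed 's/.*&viewport=\(\([^,]*\),\([^,]*\),\)\{3\}.*/\2 \3/')

# Two repetitions: the same backreferences now hold the second pair.
second_pair=$(printf '%s\n' "$line" |
    sed 's/.*&viewport=\(\([^,]*\),\([^,]*\),\)\{2\}.*/\2 \3/')

echo "$third_pair"    # fields 5 and 6
echo "$second_pair"   # fields 3 and 4
```

Note that sed prints lines that do not match unchanged. If the log may contain lines without a viewport parameter, running sed with -n and adding the p flag to the substitution prints only the lines where it succeeded.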