I am trying to understand some javascript in this gawk command:
gawk 'function getip(rec) {
    n=split(rec,a,"\"");
    split(a[n-1],ip,",");
    return ip[1]
}
$10 ~ /302/ && $6 ~ /POST/ && $7 ~ /^\/sso\/[pl]fe\/(rs|ui)\/login/ {
    lfe_user_ip=getip($0);
    user_path[lfe_user_ip]=user_path[lfe_user_ip]"_login-302"
}
/\/sso\/pfe\/rs\/profile\/customer/ && $6 ~ /PUT/ {
    pfe_user_ip=getip($0);
    if (user_path[pfe_user_ip] ~ /_login-302/) {
        if ($10 ~ /200/) successful_redirect_conversion+=1;
        else failed_redirect_conversion+=1;
    }
} END {
    print successful_redirect_conversion, failed_redirect_conversion
}'
The log lines analysed by the awk script above look like this:
[09/Oct/2017:02:21:39 -0400] 10.222.11.23 10.222.11.23 - GET /sso/lfe/ui/login http-bio-8000-exec-27 5000 200 49929 24 ?templateId=https%253A%2F%2Fwww.cargive.ca%2Fservice%2FpostLoginProcessing.a%3Fredirect%3Ddefault%26rememberMe%3D1&sourceUrl=https%3A//www.cargive.ca/service/postLoginProcessing.a?redirect=default&rememberMe=1&authlvl= "unauthenticated" "10.222.11.23, 10.222.11.23,10.222.11.23"
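I am confused by the javascript, the split method and the user_path variable.
Answer 0 (score: 1)
I am trying to understand some javascript in this gawk command:
There is no JavaScript in this script; it is pure awk.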
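To see that `split` here is the awk built-in rather than JavaScript, the `getip` function from the question can be run on its own. This is a minimal sketch, assuming a POSIX `awk` is on the PATH (gawk should behave the same for this case). The `line` variable holds only the two quoted strings from the end of the sample log line, and `client_ip` is a name introduced just for this example.

```shell
# Only the two trailing quoted strings of the sample log line are used here;
# everything before them is omitted for brevity.
line='"unauthenticated" "10.222.11.23, 10.222.11.23,10.222.11.23"'

client_ip=$(printf '%s\n' "$line" | awk '
function getip(rec) {
    n = split(rec, a, "\"")   # cut the record at every double quote; n = number of pieces
    split(a[n-1], ip, ",")    # a[n-1] is the last quoted string, cut it at every comma
    return ip[1]              # first address of that comma-separated list
}
{ print getip($0) }')

echo "$client_ip"
```

With this input the script prints `10.222.11.23`. Because the record ends with a double quote, the last piece `a[n]` is empty, which is why the function reads `a[n-1]`.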
gawk ' # call gawk; with the default FS (a single space) fields are split on runs of spaces and tabs
function getip(rec) {
# rec  -> input string (the whole log line)
# a    -> array that receives the pieces
# "\"" -> separator: a double quote
# split rec at every double quote and store the pieces in array a;
# variable n holds the return value of split(),
# that is, the number of elements in a after splitting
n=split(rec,a,"\"");
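# a[n-1] -> second-to-last element of array a; when the line ends with
#           a double quote, as in the sample, a[n] is empty and
#           a[n-1] is the last quoted string
# ip     -> array that receives the pieces
# ","    -> separator: a comma
# as above, split that string at every comma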
split(a[n-1],ip,",");
# return first element of an array ip
return ip[1]
}
# if the 10th field matches 302 and
# the 6th field matches POST and
# the 7th field starts with a match for /sso/[pl]fe/(rs|ui)/login,
# that is, one of
# /sso/pfe/rs/login or /sso/lfe/rs/login
# /sso/pfe/ui/login or /sso/lfe/ui/login
$10 ~ /302/ && $6 ~ /POST/ && $7 ~ /^\/sso\/[pl]fe\/(rs|ui)\/login/ {
# variable lfe_user_ip receives the value returned by getip()
# $0 -> current row/record/line
lfe_user_ip=getip($0);
# user_path   -> associative array
# lfe_user_ip -> array key/index
# user_path[lfe_user_ip]"_login-302" -> the value already stored under that key
# with the string "_login-302" appended (awk concatenates adjacent strings);
# the result is stored back in the array
user_path[lfe_user_ip]=user_path[lfe_user_ip]"_login-302"
}
# if the line matches the regex
# /sso/pfe/rs/profile/customer and
# the 6th field matches PUT
/\/sso\/pfe\/rs\/profile\/customer/ && $6 ~ /PUT/ {
# variable pfe_user_ip receives the value returned by getip()
pfe_user_ip=getip($0);
# if the value stored in user_path under the key pfe_user_ip
# contains _login-302
if (user_path[pfe_user_ip] ~ /_login-302/) {
# if 10th field contains 200
# increment variable successful_redirect_conversion by 1
# else increment variable failed_redirect_conversion by 1
if ($10 ~ /200/) successful_redirect_conversion+=1;
else failed_redirect_conversion+=1;
}
} END {
# after reading all input,
# print both counters (a counter that was never incremented prints as an empty string)
print successful_redirect_conversion, failed_redirect_conversion
}'
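The role of `user_path` is easier to see on a few lines of input. The sketch below runs the script from the question, unchanged apart from whitespace and a `+ 0` in the `END` block so that an untouched counter prints as `0`. The five log lines are hypothetical, invented here to follow the sample format; they are not from the real log. It assumes a POSIX `awk` is available, and `summary` is a name introduced for the example.

```shell
summary=$(awk '
function getip(rec) {
    n = split(rec, a, "\"")
    split(a[n-1], ip, ",")
    return ip[1]
}
# login POST answered with 302: remember it under the client IP
$10 ~ /302/ && $6 ~ /POST/ && $7 ~ /^\/sso\/[pl]fe\/(rs|ui)\/login/ {
    lfe_user_ip = getip($0)
    user_path[lfe_user_ip] = user_path[lfe_user_ip] "_login-302"
}
# profile PUT: counted only if the same IP had a 302 login earlier
/\/sso\/pfe\/rs\/profile\/customer/ && $6 ~ /PUT/ {
    pfe_user_ip = getip($0)
    if (user_path[pfe_user_ip] ~ /_login-302/) {
        if ($10 ~ /200/) successful_redirect_conversion += 1
        else failed_redirect_conversion += 1
    }
}
END {
    # "+ 0" is an addition for this demo: it forces a numeric 0 for unset counters
    print successful_redirect_conversion + 0, failed_redirect_conversion + 0
}' <<'EOF'
[09/Oct/2017:02:21:39 -0400] 10.0.0.1 10.0.0.1 - POST /sso/lfe/ui/login http-bio-8000-exec-1 5000 302 100 24 - "unauthenticated" "10.0.0.1, 10.0.0.1"
[09/Oct/2017:02:21:45 -0400] 10.0.0.1 10.0.0.1 - PUT /sso/pfe/rs/profile/customer http-bio-8000-exec-2 5000 200 100 24 - "authenticated" "10.0.0.1, 10.0.0.1"
[09/Oct/2017:02:22:01 -0400] 10.0.0.2 10.0.0.2 - PUT /sso/pfe/rs/profile/customer http-bio-8000-exec-3 5000 200 100 24 - "authenticated" "10.0.0.2, 10.0.0.2"
[09/Oct/2017:02:22:10 -0400] 10.0.0.3 10.0.0.3 - POST /sso/pfe/rs/login http-bio-8000-exec-4 5000 302 100 24 - "unauthenticated" "10.0.0.3, 10.0.0.3"
[09/Oct/2017:02:22:15 -0400] 10.0.0.3 10.0.0.3 - PUT /sso/pfe/rs/profile/customer http-bio-8000-exec-5 5000 500 100 24 - "authenticated" "10.0.0.3, 10.0.0.3"
EOF
)

echo "$summary"
```

This prints `1 1`. The PUT from 10.0.0.1 follows a 302 login and returns 200, so it counts as successful. The PUT from 10.0.0.2 has no earlier login and is ignored. The PUT from 10.0.0.3 follows a login but returns 500, so it counts as failed.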
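From the contents of your file, this is how awk splits the line into fields with the default FS (a single space, which splits on runs of spaces and tabs):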
Field-1 => $1 => [09/Oct/2017:02:21:39
Field-2 => $2 => -0400]
Field-3 => $3 => 10.222.11.23
Field-4 => $4 => 10.222.11.23
Field-5 => $5 => -
Field-6 => $6 => GET
Field-7 => $7 => /sso/lfe/ui/login
Field-8 => $8 => http-bio-8000-exec-27
Field-9 => $9 => 5000
Field-10 => $10 => 200
Field-11 => $11 => 49929
Field-12 => $12 => 24
Field-13 => $13 => ?templateId=https%253A%2F%2Fwww.cargive.ca%2Fservice%2FpostLoginProcessing.a%3Fredirect%3Ddefault%26rememberMe%3D1&sourceUrl=https%3A//www.cargive.ca/service/postLoginProcessing.a?redirect=default&rememberMe=1&authlvl=
Field-14 => $14 => "unauthenticated"
Field-15 => $15 => "10.222.11.23,
Field-16 => $16 => 10.222.11.23,10.222.11.23"
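A listing like the one above can be reproduced by looping over the fields of the record. The following is a minimal sketch, assuming a POSIX `awk` on the PATH; `line` holds the sample log line from the question, and `fields` and `nf` are names introduced for the example.

```shell
# The sample log line from the question, kept in single quotes so the shell
# does not interpret the &, ? and % characters.
line='[09/Oct/2017:02:21:39 -0400] 10.222.11.23 10.222.11.23 - GET /sso/lfe/ui/login http-bio-8000-exec-27 5000 200 49929 24 ?templateId=https%253A%2F%2Fwww.cargive.ca%2Fservice%2FpostLoginProcessing.a%3Fredirect%3Ddefault%26rememberMe%3D1&sourceUrl=https%3A//www.cargive.ca/service/postLoginProcessing.a?redirect=default&rememberMe=1&authlvl= "unauthenticated" "10.222.11.23, 10.222.11.23,10.222.11.23"'

# NF is the number of fields in the current record; $i is the i-th field.
fields=$(printf '%s\n' "$line" | awk '{
    for (i = 1; i <= NF; i++)
        printf "Field-%d => $%d => %s\n", i, i, $i
}')

nf=$(printf '%s\n' "$line" | awk '{ print NF }')

echo "$fields"
echo "number of fields: $nf"
```

Note that the quoted list of addresses at the end of the line contains a space, so default field splitting breaks it into `$15` and `$16`. That is why the script splits the whole record on double quotes instead of reading the address from a numbered field.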