Question

I am trying to match multiple items on each line of an httpd log file. The lines look like this:

192.168.0.1 - - [06/Apr/2016:16:35:42 +0100] "-"  "100" "GET /breacher/gibborum.do?firstnumber=1238100121135&simple=1238100121135&protocol=http&_super=telco1 HTTP/1.1" 200 161 "-" "NING/1.0"
192.168.0.1 - - [06/Apr/2016:16:35:44 +0100] "-"  "00" "GET /breacher/gibborum.do?firstnumber=1237037630256&simple=1237037630256&protocol=http&_super=telco1 HTTP/1.1" 200 136 "-" "NING/1.0"
192.168.0.1 - - [06/Apr/2016:16:35:44 +0100] "-"  "00" "GET /breacher/gibborum.do?firstnumber=1238064400578&simple=1238064400578&protocol=http&_super=telco1 HTTP/1.1" 200 136 "-" "NING/1.0"

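I am trying to extract the number, the timestamp, and the value of the _super variable. So far I can extract the number and the timestamp with this: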

 awk '{match ($0, /123([0-9]+)/, arr); print $4, arr[0]}'

How can I also extract the value that follows _super= at the end of the query string?

Answer 1

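You can change your script like this (adding gsub and $9). With the default whitespace field splitting, $9 is the request URL, and gsub strips everything up to and including _super= from it, leaving only the value: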

awk '{match ($0, /123([0-9]+)/, arr); gsub(/.*_super=/, "",$9); print $4, arr[0], $9}'
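Note that the three-argument form of match() used above is a gawk extension. If gawk is not available, the same idea can be sketched with the two-argument match() and the RSTART/RLENGTH variables. This is only a sketch under some assumptions that are not in the original answer: the number is taken from the firstnumber= parameter instead of matching /123.../, the leading "[" is stripped from the timestamp, and the function name extract_fields is made up for illustration.

```shell
# Hypothetical helper: prints "timestamp number _super-value" for each log line.
# Uses only the two-argument match(), so it does not depend on gawk.
extract_fields() {
  awk '{
    num = ""; sup = ""
    # Assumption: the wanted number is the value of firstnumber=
    # ("firstnumber=" is 12 characters long).
    if (match($0, /firstnumber=[0-9]+/))
      num = substr($0, RSTART + 12, RLENGTH - 12)
    # The _super value runs until the next "&", space, or double quote
    # ("_super=" is 7 characters long).
    if (match($0, /_super=[^& "]+/))
      sup = substr($0, RSTART + 7, RLENGTH - 7)
    # $4 is the timestamp with a leading "["; drop the bracket.
    ts = $4
    sub(/^\[/, "", ts)
    print ts, num, sup
  }' "$@"
}

# Usage with one of the sample lines from the question:
extract_fields <<'EOF'
192.168.0.1 - - [06/Apr/2016:16:35:42 +0100] "-"  "100" "GET /breacher/gibborum.do?firstnumber=1238100121135&simple=1238100121135&protocol=http&_super=telco1 HTTP/1.1" 200 161 "-" "NING/1.0"
EOF
```

Because this variant searches the whole line for _super=, it does not rely on the URL being field 9, so it should keep working if the number of quoted fields before the request changes.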
