Regular expression to parse a CustomLog format (PHP)

Time: 2016-05-26 19:05:15

Tags: php regex apache logging

I am trying to parse log entries written with this CustomLog format:

LogFormat "%v %{X-Forwarded-For}i %h %l %u %t \"%r\" %>s %b" MyCustomLog

Here is what an entry looks like. Note that the IPs passed in the X-Forwarded-For header are separated by a comma.

my.server.com 24.24.24.3, 1.2.3.4 1.2.3.5 - - [18/May/2016:02:57:25 -0400] "GET /veer/eye?params=1&are=2&right=3&here=4 HTTP/1.1" 200 146351

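I want to capture the following fields:

  • X-Forwarded-For IPs (comma-separated)
  • remote hostname
  • remote logname (may be -)
  • remote user (may be -)
  • timestamp inside the [] block
  • request URL (in quotes)
  • response size (the last value)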
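Two of the fields listed above usually need a second step once they have been captured: the X-Forwarded-For value is a comma-separated list, and the bracketed timestamp is a string rather than a date. The following is a minimal sketch of that post-processing, using values copied from the sample entry. The variable names are illustrative, and the date format string assumes Apache's default %t layout.

```php
<?php
// Values copied from the sample entry; variable names are illustrative only.
$forwardedFor = '24.24.24.3, 1.2.3.4';
$timestamp    = '18/May/2016:02:57:25 -0400';

// Split on commas, tolerating optional whitespace around each one.
$ips = preg_split('/\s*,\s*/', $forwardedFor, -1, PREG_SPLIT_NO_EMPTY);

// Assumes the default %t layout: day/month/year:hour:minute:second zone.
$date = DateTimeImmutable::createFromFormat('d/M/Y:H:i:s O', $timestamp);
if ($date === false) {
    throw new RuntimeException('Unexpected timestamp format: ' . $timestamp);
}

print_r($ips);                           // two entries: 24.24.24.3 and 1.2.3.4
echo $date->format(DATE_ATOM), "\n";     // 2016-05-18T02:57:25-04:00
```

A forwardedFor value of - (no header sent) would come back from this split as a single - entry, so it may be worth checking for that before splitting.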

My regex is a bit rusty, at least when it comes to the negative lookahead I think I need here.

Any help is much appreciated!

1 answer:

Answer 0: (score: 1)

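Here is a more complete pattern that should work for you. I broke every part of the entry out into its own group and gave each group a name. It matches the example in your question as well as the one from the comments.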

Demo: https://3v4l.org/jMKFL

<?php
$pattern = '/(?P<hostname>[\w\.]+) '
         . '(?P<forwardedFor>(?:[\d\.]+, )*(?:[\d\.]+)|-) '
         . '(?P<remoteHostname>[\d\.]+) '
         . '(?P<remoteLogname>[^\s]+) '
         . '(?P<remoteUsername>[^\s]+) '
         . '\['
            . '(?P<requestDate>[^\]]+)'
         . '\] '
         . '"'
            . '(?P<method>\w+) '
            . '(?P<uri>[^\s]+) '
            . '(?P<httpVersion>[^"]+)'
         . '" '
         . '(?P<responseStatus>\d+) '
         . '(?P<responseSize>\d+)/';

$test = 'my.server.com 24.24.24.3, 1.2.3.4 1.2.3.5 - - [18/May/2016:02:57:25 -0400] "GET /veer/eye?params=1&are=2&right=3&here=4 HTTP/1.1" 200 146351';
$test2 = 'qa-test.test.com - 80.82.65.120 - - [18/May/2016:00:30:20 -0400] "GET // HTTP/1.1" 404 198';

preg_match($pattern, $test, $matches);
print_r($matches);

preg_match($pattern, $test2, $matches);
print_r($matches);

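Output (the pattern is unanchored and [\w\.]+ does not match a hyphen, so the second match starts at test.test.com rather than qa-test.test.com):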

Array
(
    [0] => my.server.com 24.24.24.3, 1.2.3.4 1.2.3.5 - - [18/May/2016:02:57:25 -0400] "GET /veer/eye?params=1&are=2&right=3&here=4 HTTP/1.1" 200 146351
    [hostname] => my.server.com
    [1] => my.server.com
    [forwardedFor] => 24.24.24.3, 1.2.3.4
    [2] => 24.24.24.3, 1.2.3.4
    [remoteHostname] => 1.2.3.5
    [3] => 1.2.3.5
    [remoteLogname] => -
    [4] => -
    [remoteUsername] => -
    [5] => -
    [requestDate] => 18/May/2016:02:57:25 -0400
    [6] => 18/May/2016:02:57:25 -0400
    [method] => GET
    [7] => GET
    [uri] => /veer/eye?params=1&are=2&right=3&here=4
    [8] => /veer/eye?params=1&are=2&right=3&here=4
    [httpVersion] => HTTP/1.1
    [9] => HTTP/1.1
    [responseStatus] => 200
    [10] => 200
    [responseSize] => 146351
    [11] => 146351
)
Array
(
    [0] => test.test.com - 80.82.65.120 - - [18/May/2016:00:30:20 -0400] "GET // HTTP/1.1" 404 198
    [hostname] => test.test.com
    [1] => test.test.com
    [forwardedFor] => -
    [2] => -
    [remoteHostname] => 80.82.65.120
    [3] => 80.82.65.120
    [remoteLogname] => -
    [4] => -
    [remoteUsername] => -
    [5] => -
    [requestDate] => 18/May/2016:00:30:20 -0400
    [6] => 18/May/2016:00:30:20 -0400
    [method] => GET
    [7] => GET
    [uri] => //
    [8] => //
    [httpVersion] => HTTP/1.1
    [9] => HTTP/1.1
    [responseStatus] => 404
    [10] => 404
    [responseSize] => 198
    [11] => 198
)
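In the second result the hostname comes back as test.test.com because the match is free to start partway through the line. A possible variant, sketched below under a few assumptions, anchors the pattern with ^ and $, allows hyphens in the hostname, and drops the numeric duplicate keys from the result. The function name parseLogLine is hypothetical and not part of the original answer. Using \S+ for the remote host and accepting - for the response size are added assumptions (%h can log a hostname when lookups are enabled, and %b logs - when no bytes are sent).

```php
<?php
// Hypothetical helper: returns the named fields of one log line, or null if it does not match.
function parseLogLine(string $line): ?array
{
    $pattern = '/^(?P<hostname>[\w.-]+) '                    // %v, hyphens allowed
             . '(?P<forwardedFor>(?:[\d.]+, )*[\d.]+|-) '    // IP list or "-"
             . '(?P<remoteHostname>\S+) '                    // %h (assumption: any non-space token)
             . '(?P<remoteLogname>\S+) '                     // %l
             . '(?P<remoteUsername>\S+) '                    // %u
             . '\[(?P<requestDate>[^\]]+)\] '                // %t
             . '"(?P<method>\w+) (?P<uri>\S+) (?P<httpVersion>[^"]+)" '
             . '(?P<responseStatus>\d+) '                    // %>s
             . '(?P<responseSize>\d+|-)$/';                  // %b, "-" when no bytes are sent

    if (preg_match($pattern, $line, $matches) !== 1) {
        return null;
    }

    // preg_match returns each named group twice (by name and by index); keep only the names.
    return array_filter($matches, 'is_string', ARRAY_FILTER_USE_KEY);
}

$entry = parseLogLine('qa-test.test.com - 80.82.65.120 - - [18/May/2016:00:30:20 -0400] "GET // HTTP/1.1" 404 198');
echo $entry['hostname'], "\n";   // qa-test.test.com
```

Anchoring also means a line with unexpected leading or trailing content returns null instead of a partial match, which tends to make malformed entries easier to spot.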