PHP preg_match optional groups

Time: 2019-07-10 14:01:17

Tags: php regex

I wrote a regular expression:

(^.*)(\[{1}[0-9]+:[0-9]+:[0-9]+:[0-9]+\]{1}) (\"{1}.+\"{1}) ([0-9]+) ([0-9-]+)

to match the following string:

141.243.1.172 [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233

and I run it with PHP's preg_match.

For example, when I remove the first part, 141.243.1.172, from the string, preg_match returns:

array(6
 0  =>  [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233
 1  =>  // correctly empty
 2  =>  [29:23:53:25]
 3  =>  "GET /Software.html HTTP/1.0"
 4  =>  200
 5  =>  233
 )

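Index 1 is correctly empty. However, if I remove [29:23:53:25] from the string instead, preg_match gives me an empty array. How can I get a result like the one above, where only the relevant index is empty rather than the whole array?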
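The empty array follows from the pattern itself: the second group and the space after it are mandatory, so when the timestamp is missing the whole match fails and preg_match returns 0 with an empty $matches. Below is a minimal sketch that reproduces this. The two shortened input strings are assumptions, reconstructed from the description above rather than copied from it.

```php
<?php
// Original pattern from the question, wrapped in / delimiters.
$pattern = '/(^.*)(\[{1}[0-9]+:[0-9]+:[0-9]+:[0-9]+\]{1}) (\"{1}.+\"{1}) ([0-9]+) ([0-9-]+)/';

// Assumed inputs: the full line, the line without the IP,
// and the line without the timestamp.
$inputs = [
    'full'         => '141.243.1.172 [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233',
    'no_ip'        => '[29:23:53:25] "GET /Software.html HTTP/1.0" 200 233',
    'no_timestamp' => '141.243.1.172 "GET /Software.html HTTP/1.0" 200 233',
];

$counts = [];
$groups = [];
foreach ($inputs as $label => $input) {
    // preg_match returns 1 on a match and 0 otherwise;
    // $matches is left as an empty array when nothing matches.
    $counts[$label] = preg_match($pattern, $input, $matches);
    $groups[$label] = $matches;
    printf("%s: returned %d, %d entries\n", $label, $counts[$label], count($matches));
}
```

Only the no_timestamp input fails, because nothing in it can satisfy the mandatory bracketed group.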

2 Answers:

Answer 0: (score: 1)

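The first part works because of .*, which can also match an empty string. If you also want to be able to drop the second part, make both groups optional and make the first group non-greedy. Move the space into the second group as well.

Note that you do not have to escape the double quotes, and the quantifier {1} is redundant, so it can be omitted.

There is only one more double quote after the first one, but to guard against possible overmatching you can make that match non-greedy as well, or use a negated character class ("[^"]+") to avoid unnecessary backtracking.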

(^.*?)?(\[[0-9]+:[0-9]+:[0-9]+:[0-9]+\] )?(".+?") ([0-9]+) ([0-9-]+)

Regex demo

For example:

$strings = [
    '141.243.1.172 [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233',
    '[29:23:53:25] "GET /Software.html HTTP/1.0" 200 233',
    '"GET /Software.html HTTP/1.0" 200 233'
];

$pattern = '/(^.*?)?(\[[0-9]+:[0-9]+:[0-9]+:[0-9]+\] )?(".+?") ([0-9]+) ([0-9-]+)/';

foreach ($strings as $string) {
    preg_match($pattern, $string, $matches);
    print_r($matches);
}

Result:

Array
(
    [0] => 141.243.1.172 [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233
    [1] => 141.243.1.172 
    [2] => [29:23:53:25] 
    [3] => "GET /Software.html HTTP/1.0"
    [4] => 200
    [5] => 233
)
Array
(
    [0] => [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233
    [1] => 
    [2] => [29:23:53:25] 
    [3] => "GET /Software.html HTTP/1.0"
    [4] => 200
    [5] => 233
)
Array
(
    [0] => "GET /Software.html HTTP/1.0" 200 233
    [1] => 
    [2] => 
    [3] => "GET /Software.html HTTP/1.0"
    [4] => 200
    [5] => 233
)

Php demo
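The negated character class variant mentioned in this answer can be sketched as follows. This is only an illustration: the fourth input string (IP present, timestamp removed) is an assumption added to cover the case from the question, and the array name $results is likewise made up for the example.

```php
<?php
// Same pattern as in the answer, but the quoted part uses "[^"]+" instead of ".+?".
$pattern = '/(^.*?)?(\[[0-9]+:[0-9]+:[0-9]+:[0-9]+\] )?("[^"]+") ([0-9]+) ([0-9-]+)/';

$strings = [
    '141.243.1.172 [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233',
    '[29:23:53:25] "GET /Software.html HTTP/1.0" 200 233',
    '"GET /Software.html HTTP/1.0" 200 233',
    // Assumed input: the IP is kept and only the timestamp is removed.
    '141.243.1.172 "GET /Software.html HTTP/1.0" 200 233',
];

$results = [];
foreach ($strings as $string) {
    // Collect the groups only when the pattern actually matched.
    if (preg_match($pattern, $string, $matches) === 1) {
        $results[] = $matches;
    }
}

print_r($results);
```

Because the space sits inside the optional second group, a missing timestamp leaves index 2 empty while the other indexes stay where they were.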

Answer 1: (score: 0)

Change the regular expression to this:

((^.*)(\[{1}[0-9]+:[0-9]+:[0-9]+:[0-9]+\]{1}) )?(\"{1}.+\"{1}) ([0-9]+) ([0-9-]+)

For 141.243.1.172 [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233

the result will be

Array
(
    [0] => 141.243.1.172 [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233
    [1] => 141.243.1.172 [29:23:53:25] 
    [2] => 141.243.1.172
    [3] => [29:23:53:25]
    [4] => "GET /Software.html HTTP/1.0"
    [5] => 200
    [6] => 233
)

and for [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233

the result will be

Array
(
    [0] => [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233
    [1] => [29:23:53:25] 
    [2] => 
    [3] => [29:23:53:25]
    [4] => "GET /Software.html HTTP/1.0"
    [5] => 200
    [6] => 233
)
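The outer wrapping group in this pattern takes index 1, which moves every other group up by one. If the original numbering matters, one possible variation (not part of the answer above) is to make the wrapper non-capturing with (?: ... )?. A minimal sketch, with the redundant {1} quantifiers and quote escapes dropped and the variable names chosen only for illustration:

```php
<?php
// Variation on the pattern above: the optional wrapper is non-capturing,
// so the IP stays at index 1 and the timestamp at index 2.
$pattern = '/(?:(^.*)(\[[0-9]+:[0-9]+:[0-9]+:[0-9]+\]) )?(".+") ([0-9]+) ([0-9-]+)/';

$strings = [
    '141.243.1.172 [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233',
    '[29:23:53:25] "GET /Software.html HTTP/1.0" 200 233',
    '"GET /Software.html HTTP/1.0" 200 233',
];

$results = [];
foreach ($strings as $string) {
    // preg_match returns 1 when the pattern matched.
    if (preg_match($pattern, $string, $matches) === 1) {
        $results[] = $matches;
    }
}

print_r($results);
```

In this sketch the first group still captures the space that precedes the bracket, so trim() may be needed on index 1.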