PHP regex: matching optional content while keeping array positions

Time: 2014-07-08 14:53:38

Tags: php arrays regex preg-match preg-match-all

I am trying to parse some Apache error logs.

These are sample strings I want to match:

[Mon May 19 15:56:43 2014] [error] proxy: pass request body failed to 111.111.111.111:3000 (111.111.111.111) from 111.111.111.111 ()

[Mon May 19 15:58:00 2014] [error] (70007)The timeout specified has expired: proxy: prefetch request body failed to 111.111.111.111:3000 (111.111.111.111) from 111.111.111.111 ()

[Mon May 19 23:14:56 2014] [error] (70014)End of file found: proxy: prefetch request body failed to 111.111.111.111:3000 (111.111.111.111) from 111.111.111.111 ()

I am using this regular expression:

^\[([^\]]+)\] \[([^\]]+)\] \(?([0-9]+)?\)?([a-zA-Z,\ ,\:]+)([0-9,\.\:]+) \(([0-9,\.]+)\) from ([0-9,\.]+) \(\)

This part

\(?([0-9]+)?\)?

makes the (70007) or (70014) prefix optional.

  1. [1-25] Thu Apr 10 18:35:49 2014
  2. [28-33] error
  3. [36-41] 70007
  4. [42-116] The timeout specified has expired: proxy: prefetch request body failed to
  5. [116-135] 111.111.111.111:3100
  6. [137-151] 111.111.111.111
  7. [158-172] 111.111.111.111

This is the output when (70007) or (70014) is found. When it is not found, the output is this:

  1. [1-25] Mon Jul 07 17:07:04 2014
  2. [28-33] error
  3. [35-70] proxy: pass request body failed to
  4. [70-89] 111.111.111.111:3000
  5. [91-105] 111.111.111.111
  6. [112-125] 111.111.111.111

The result goes into an array. I would like the array position for the missing code to stay present but empty, something like this:

  1.  [1-25]  `Mon Jul 07 17:07:04 2014`
  2.  [28-33] `error`
  3.  [36-41] ``
  4.  [35-70] `proxy: pass request body failed to `
  5.  [70-89] `111.111.111.111:3000`
  6.  [91-105]    `111.111.111.111`
  7.  [112-125]   `111.111.111.111`
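As a minimal sketch of the behaviour being asked for, the snippet below runs the question's own pattern through `preg_match()` on the sample line that has no error code. The `~` delimiters and the variable names are assumptions added for illustration. It relies on the documented PHP behaviour that a capture group which did not take part in the match is still reported as an empty string when a later group matched, so the position is kept even though a regex tester may not list it.

```php
<?php
// Pattern copied from the question; only the ~ delimiters are added here.
$pattern = '~^\[([^\]]+)\] \[([^\]]+)\] \(?([0-9]+)?\)?([a-zA-Z,\ ,\:]+)([0-9,\.\:]+) \(([0-9,\.]+)\) from ([0-9,\.]+) \(\)~';

// Sample line without an error code, taken from the question.
$line = '[Mon May 19 15:56:43 2014] [error] proxy: pass request body failed to '
      . '111.111.111.111:3000 (111.111.111.111) from 111.111.111.111 ()';

if (preg_match($pattern, $line, $matches) === 1) {
    // Group 3 did not participate, but groups 4 to 7 did,
    // so index 3 is still present and holds an empty string.
    var_dump($matches[3]);     // string(0) ""
    var_dump(count($matches)); // int(8): the full match plus 7 groups
}
```

If a `null` is preferred over an empty string for the missing code, the `PREG_UNMATCHED_AS_NULL` flag (PHP 7.2 and later) can be passed as the fourth argument to `preg_match()`.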
      

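Answer

The solution is

\(?([0-9|\ ]+)?\)?

instead of

\(?([0-9]+)?\)?

Because the parentheses and the digits are optional (?), I used an "or" to select either digits or white space. It now returns:

1.  [1-25]  `Mon May 19 15:56:43 2014`
2.  [28-33] `error`
4.  [35-70] `proxy: pass request body failed to `
5.  [70-90] `111.111.111.111:3000`
6.  [92-107]    `111.111.111.111`
7.  [114-129]   `111.111.111.111`

As you can see, it skips a position, going from 2 to 4, unless the line contains a code such as (00000). I hope this helps someone else.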

1 answer:

Answer 0: (score: 2)

You can use the following regular expression to get the desired output.

preg_match('~\[([^]]+)\]             # Match open/close brackets and capture date
              \s+                    # Match one or more white-space characters
              \[([^]]+)\]            # Match open/close brackets and capture "error"
              \s+                    # Match one or more white-space characters
              (?:\(([^)]+)\))?       # Match optional parentheses and capture the code inside
              (\D+)                  # Match and capture one or more non-digit characters
              \s+                    # Match one or more white-space characters
              ([\d:.]+)              # Match and capture first set of digits (address:port)
              \s+                    # Match one or more white-space characters
              \(([^)]+)\)            # Match and capture digits inside parentheses
              \D+                    # Match one or more characters that are not digits
              ([\d.]+)               # Match and capture last set of digits
            ~x', $string, $matches);

var_dump($matches);
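
The following is a hedged usage sketch rather than part of the original answer: it applies the same pattern (comments removed, whitespace ignored through the `x` flag) to the three sample lines from the question. The names `$pattern`, `$lines`, and `$rows` are illustrative assumptions. It shows that index 3 holds the error code when present and an empty string when absent, so every row has the same number of positions.

```php
<?php
// Same pattern as in the answer, written on one line; the x flag ignores the spaces.
$pattern = '~\[([^]]+)\] \s+ \[([^]]+)\] \s+ (?:\(([^)]+)\))? (\D+) \s+ ([\d:.]+) \s+ \(([^)]+)\) \D+ ([\d.]+)~x';

// The three sample log lines from the question.
$lines = [
    '[Mon May 19 15:56:43 2014] [error] proxy: pass request body failed to 111.111.111.111:3000 (111.111.111.111) from 111.111.111.111 ()',
    '[Mon May 19 15:58:00 2014] [error] (70007)The timeout specified has expired: proxy: prefetch request body failed to 111.111.111.111:3000 (111.111.111.111) from 111.111.111.111 ()',
    '[Mon May 19 23:14:56 2014] [error] (70014)End of file found: proxy: prefetch request body failed to 111.111.111.111:3000 (111.111.111.111) from 111.111.111.111 ()',
];

$rows = [];
foreach ($lines as $line) {
    // Keep only the lines that match; index 0 is the full match, 1 to 7 are the groups.
    if (preg_match($pattern, $line, $m) === 1) {
        $rows[] = $m;
    }
}

foreach ($rows as $row) {
    // Print the optional code (index 3) and the message (index 4) for each line.
    printf("code=[%s] message=[%s]\n", $row[3], $row[4]);
}
```

Because the optional group sits in the middle of the pattern, each row has eight entries whether or not the code is present, which keeps the later fields at fixed indexes.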
