I am trying to parse some Apache error logs.
These are the sample strings I want to match:
[Mon May 19 15:56:43 2014] [error] proxy: pass request body failed to 111.111.111.111:3000 (111.111.111.111) from 111.111.111.111 ()
and
[Mon May 19 15:58:00 2014] [error] (70007)The timeout specified has expired: proxy: prefetch request body failed to 111.111.111.111:3000 (111.111.111.111) from 111.111.111.111 ()
and
[Mon May 19 23:14:56 2014] [error] (70014)End of file found: proxy: prefetch request body failed to 111.111.111.111:3000 (111.111.111.111) from 111.111.111.111 ()
I am using this regular expression:
^\[([^\]]+)\] \[([^\]]+)\] \(?([0-9]+)?\)?([a-zA-Z,\ ,\:]+)([0-9,\.\:]+) \(([0-9,\.]+)\) from ([0-9,\.]+) \(\)
This part
\(?([0-9]+)?\)?
is what makes (70007) or (70014) optional.
Thu Apr 10 18:35:49 2014
error
70007
The timeout specified has expired: proxy: prefetch request body failed to
111.111.111.111:3100
111.111.111.111
111.111.111.111
That is the output when (70007) or (70014) is found. When no such code is found, the output is this:
Mon Jul 07 17:07:04 2014
error
proxy: pass request body failed to
111.111.111.111:3000
111.111.111.111
111.111.111.111
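The result goes into an array, and I would like the array position for the code to be kept but left empty, something like this: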
1. [1-25] `Mon Jul 07 17:07:04 2014`
2. [28-33] `error`
4. [36-41] ``
5. [35-70] `proxy: pass request body failed to `
5. [70-89] `111.111.111.111:3000`
6. [91-105] `111.111.111.111`
7. [112-125] `111.111.111.111`
Answer
The solution is
\(?([0-9|\ ]+)?\)?
instead of
\(?([0-9]+)?\)?
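Because the parentheses and the digits are both optional, I let the group match either digits or a space. (Note that inside a character class the | is a literal pipe rather than alternation, so [0-9 ] would behave the same way on these logs.) It now returns: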
1. [1-25] `Mon May 19 15:56:43 2014`
2. [28-33] `error`
4. [35-70] `proxy: pass request body failed to `
5. [70-90] `111.111.111.111:3000`
6. [92-107] `111.111.111.111`
7. [114-129] `111.111.111.111`
You can see that it skips a position, going from 2 to 4, unless the line has a code such as (00000). I hope this helps someone else.
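The numbered listing above looks like regex-tester output, which simply omits a group that did not take part in the match. As a minimal sketch, assuming the pattern is eventually run through PHP's `preg_match` (the question does not say which language it uses; the `~` delimiters and the variable names are illustrative), the array slot for the optional code is kept: it holds an empty string by default, or `null` when the `PREG_UNMATCHED_AS_NULL` flag (PHP 7.2+) is passed.

```php
<?php
// Revised pattern from this answer, wrapped in ~ delimiters (delimiter choice is an assumption).
$pattern = '~^\[([^\]]+)\] \[([^\]]+)\] \(?([0-9|\ ]+)?\)?([a-zA-Z,\ ,\:]+)([0-9,\.\:]+) \(([0-9,\.]+)\) from ([0-9,\.]+) \(\)~';

// Sample line from the question that has no "(70007)"-style code.
$line = '[Mon May 19 15:56:43 2014] [error] proxy: pass request body failed to 111.111.111.111:3000 (111.111.111.111) from 111.111.111.111 ()';

// Default behaviour: an unmatched group in the middle of the pattern becomes ''.
preg_match($pattern, $line, $default);

// With the flag, the same slot is null, which separates "absent" from "empty".
preg_match($pattern, $line, $asNull, PREG_UNMATCHED_AS_NULL);

var_dump(count($default)); // full match plus seven groups
var_dump($default[3]);     // slot for the optional code, default behaviour
var_dump($asNull[3]);      // same slot with PREG_UNMATCHED_AS_NULL
var_dump($default[4]);     // the message text still lands in slot 4
```

Either way the indexes stay stable, so code that reads the address from slot 5 does not need to care whether the numeric code was present.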
Answer 0 (score: 2)
You can use the following regular expression to get the desired output.
preg_match('~\[([^]]+)\]   # Match the brackets and capture the date
\s+                        # Match one or more whitespace characters
\[([^]]+)\]                # Match the brackets and capture "error"
\s+                        # Match one or more whitespace characters
(?:\(([^)]+)\))?           # Optionally match "(code)" and capture the code
(\D+)                      # Capture one or more non-digit characters (the message)
\s+                        # Match one or more whitespace characters
([\d:.]+)                  # Capture the first address, including the port
\s+                        # Match one or more whitespace characters
\(([^)]+)\)                # Capture the address inside parentheses
\D+                        # Match one or more non-digit characters
([\d.]+)                   # Capture the last address
~x', $string, $matches);
var_dump($matches);
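To show how the result might be consumed, here is a rough sketch that applies the same pattern to the sample lines from the question. The helper name `parse_proxy_error`, the named groups, the leading `^` anchor, and the output keys are all assumptions added for illustration; they are not part of the original answer.

```php
<?php
// Hypothetical helper: same pattern as above, written on one line with named groups.
function parse_proxy_error(string $line): ?array
{
    $pattern = '~^\[(?<date>[^]]+)\]\s+\[(?<level>[^]]+)\]\s+(?:\((?<code>[^)]+)\))?'
             . '(?<message>\D+)\s+(?<target>[\d:.]+)\s+\((?<host>[^)]+)\)\D+(?<client>[\d.]+)~';

    if (!preg_match($pattern, $line, $m)) {
        return null; // the line does not look like a proxy error entry
    }

    return [
        'date'    => $m['date'],
        'level'   => $m['level'],
        // preg_match reports the skipped optional group as '', so map that to null.
        'code'    => $m['code'] === '' ? null : $m['code'],
        'message' => trim($m['message']),
        'target'  => $m['target'],
        'host'    => $m['host'],
        'client'  => $m['client'],
    ];
}

$samples = [
    '[Mon May 19 15:56:43 2014] [error] proxy: pass request body failed to 111.111.111.111:3000 (111.111.111.111) from 111.111.111.111 ()',
    '[Mon May 19 23:14:56 2014] [error] (70014)End of file found: proxy: prefetch request body failed to 111.111.111.111:3000 (111.111.111.111) from 111.111.111.111 ()',
];

$parsed = array_map('parse_proxy_error', $samples);
var_dump($parsed);
```

Named keys avoid relying on numeric positions at all, which sidesteps the question of whether an index is skipped when the code is missing.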