I am trying to parse this string:
[Wed May 06 15:09:08.160122 2015] [proxy_fcgi:error] [pid 30987:tid 140285789038336] [client 192.168.56.1:39157] AH01071: Got error 'PHP message: PHP Fatal error: Undefined class constant 'self::TF_TEASER_LONG' in /var/www/foo/admin/server/php/UploadHandler.php on line 588\nPHP message: PHP Stack trace:\nPHP message: PHP 1. {main}() /var/www/foo/admin/server/php/index.php:0\nPHP message: PHP 2. UploadHandler->__construct() /var/www/foo/admin/server/php/index.php:14\nPHP message: PHP 3. UploadHandler->initialize() /var/www/foo/admin/server/php/UploadHandler.php:172\nPHP message: PHP 4. UploadHandler->post() /var/www/foo/admin/server/php/UploadHandler.php:187\nPHP message: PHP 5. UploadHandler->handle_file_upload() /var/www/foo/admin/server/php/UploadHandler.php:767\n', referer: http://foo.com/admin/module.php?id=29
I expect the final matches to be:
1 -> Wed
2 -> May
3 -> 06
4 -> 15
5 -> 09
6 -> 08
7 -> 2015
8 -> proxy_fcgi:error
9 -> 192.168.56.1:39157
10 -> PHP Fatal error
11 -> Undefined class constant 'self::TF_TEASER_LONG'
12 -> /var/www/foo/admin/server/php/UploadHandler.php
13 -> 588
14 -> PHP message: PHP 1. {main}() /var/www/foo/admin/server/php/index.php:0\nPHP message: PHP 2. UploadHandler->__construct() /var/www/foo/admin/server/php/index.php:14\nPHP message: PHP 3. UploadHandler->initialize() /var/www/foo/admin/server/php/UploadHandler.php:172\nPHP message: PHP 4. UploadHandler->post() /var/www/foo/admin/server/php/UploadHandler.php:187\nPHP message: PHP 5. UploadHandler->handle_file_upload() /var/www/foo/admin/server/php/UploadHandler.php:767\n
15 -> http://foo.com/admin/module.php?id=29
I am currently stuck at this regex and already fail to understand the basic principles:
/(\[(.*?)\])?((?<=\')(.*)(?=\'))?(, referer: (.*))*/g
Why, for example, can't I put a quantifier such as {4} after (\[(.*?)\])? Here is a test case:
Answer 0 (score: 0)
The string you gave is very specific, so my regex may not match every one of its "siblings", but it looks like this:
/\[(\w+)\s(\w+)\s(\d{2})\s(\d{2}):(\d{2}):(\d{2})\.\d+\s(\d{4})\]\s*\[([^\]]+)\]\s\[[^\]]+\]\s\[client\s(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}:\d+)\][^\']+\'[^:]+:\s([^:]+):\s+(.*?)\sin\s(.*?)\son\sline\s(\d+)\\n(.*?)referer:\s(.*)/g
https://regex101.com/r/dI9oO5/1
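As a rough sketch of how the pattern above behaves outside regex101, here it is applied with Python's re module. The names LINE and PATTERN are only illustrative, and the log line is abbreviated to a single stack frame to keep it short; the \n sequences are literal backslash-n characters, as in the original log.

```python
import re

# Abbreviated copy of the log line from the question (one stack frame only).
# Raw strings keep "\n" as the two literal characters backslash + n.
LINE = (
    r"[Wed May 06 15:09:08.160122 2015] [proxy_fcgi:error] "
    r"[pid 30987:tid 140285789038336] [client 192.168.56.1:39157] "
    r"AH01071: Got error 'PHP message: PHP Fatal error: Undefined class constant "
    r"'self::TF_TEASER_LONG' in /var/www/foo/admin/server/php/UploadHandler.php "
    r"on line 588\nPHP message: PHP Stack trace:\n"
    r"PHP message: PHP 1. {main}() /var/www/foo/admin/server/php/index.php:0\n"
    r"', referer: http://foo.com/admin/module.php?id=29"
)

# The pattern from the answer, split over several lines for readability only.
PATTERN = re.compile(
    r"\[(\w+)\s(\w+)\s(\d{2})\s(\d{2}):(\d{2}):(\d{2})\.\d+\s(\d{4})\]"
    r"\s*\[([^\]]+)\]\s\[[^\]]+\]"
    r"\s\[client\s(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}:\d+)\]"
    r"[^\']+\'[^:]+:\s([^:]+):\s+(.*?)\sin\s(.*?)\son\sline\s(\d+)"
    r"\\n(.*?)referer:\s(.*)"
)

m = PATTERN.search(LINE)
for number in range(1, 16):
    print(number, "->", m.group(number))
```

Groups 1 to 13 and 15 come out as listed in the question. Group 14 is a little wider than requested: it still starts with the "PHP Stack trace:" header and ends with the closing quote and comma before "referer:", so it would need trimming or a tighter pattern.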
To answer your question:
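The * and + operators are "greedy", which means that by default they match as many characters as possible. To change this behavior you can append ?, so .*? means: match any character, but stop as soon as possible (don't be greedy).
The greediness of * (without the ?) makes you consume more characters than you want, leaving little for the rest of the pattern.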
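A small illustration of the difference, using Python's re module on a shortened piece of the log header (the variable names are made up for the example):

```python
import re

# Three bracketed blocks, shortened from the log line in the question.
text = "[proxy_fcgi:error] [pid 30987] [client 192.168.56.1:39157]"

# Greedy: .* runs to the end of the string, then backtracks to the LAST "]".
greedy = re.search(r"\[(.*)\]", text).group(1)

# Lazy: .*? stops at the FIRST "]" that lets the rest of the pattern match.
lazy = re.search(r"\[(.*?)\]", text).group(1)

print(greedy)  # proxy_fcgi:error] [pid 30987] [client 192.168.56.1:39157
print(lazy)    # proxy_fcgi:error
```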
In the expected result you describe, you want each part of the date in its own variable, so that is at least one of the reasons why (\[(.*?)\]){4} cannot work.
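To make this concrete, here is a sketch in Python (behavior in other regex flavors may differ in detail, and the variable names are only for illustration). It also shows two further obstacles that seem to apply here: the bracketed blocks are separated by spaces, and a repeated capture group only keeps its last iteration.

```python
import re

# Bracketed header of the log line from the question.
header = (
    "[Wed May 06 15:09:08.160122 2015] [proxy_fcgi:error] "
    "[pid 30987:tid 140285789038336] [client 192.168.56.1:39157]"
)

# 1. The blocks are separated by spaces, so four back-to-back
#    repetitions of \[...\] never match at all.
strict = re.search(r"(\[(.*?)\]){4}", header)
print(strict)  # None

# 2. Allowing whitespace between blocks makes it match, but a repeated
#    group only remembers the text from its last iteration.
loose = re.match(r"(?:\[(.*?)\]\s*){4}", header)
print(loose.group(1))  # client 192.168.56.1:39157

# 3. findall returns all four blocks, yet the date is still one
#    unsplit string; its parts need capture groups of their own.
blocks = re.findall(r"\[(.*?)\]", header)
print(blocks[0])  # Wed May 06 15:09:08.160122 2015
```

This is why the pattern in the answer spells out one capture group per date component instead of repeating a generic bracket group.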