Regex pattern where a group may be absent

Time: 2015-04-28 13:30:54

Tags: regex

I have a regex pattern that needs to match any of the following lines:

10-10-15 15:16:41.1 Some Text here 
10-10-15 15:16:41.12 Some Text here 
10-10-15 15:16:41.123 Some Text here 
10-10-15 15:16:41 Some Text here 

I can match the first three with the pattern below:

(?<date>(?<day>\d{1,2})-(?<month>\d{1,2})-(?<year>(?:\d{4}|\d{2}))\s(?<time>(?<hour>\d{2}):(?<minutes>\d{2}):(?<seconds>\d{2})\.(?<milli>\d{0,3})))\s(?<Line>.*)

How can I also match the line that has no milliseconds (10-10-15 15:16:41 Some Text here) and still get an empty value or 0 for that group in my result?

Thanks

With the pattern as it stands, each of the following lines matches:

10-10-15 15:16:41.123 Some text Here
10-10-15 15:16:41.12 Some Text here 
10-10-15 15:16:41.1 Some Text here 
10-10-15 15:16:41. Some Text here 

The groups look like this:

date    [0-18]  `10-10-15 15:16:41.`
day     [0-2]   `10`
month   [3-5]   `10`
year    [6-8]   `15`
time    [9-18]  `15:16:41.`
hour    [9-11]  `15`
minutes [12-14] `16`
seconds [15-17] `41`
milli   [18-18] ``
Line    [19-34] `Some Text here `
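
The group table above can be reproduced with a short script. This is only a sketch in Python, which is an assumption on my part since the question does not name a regex engine; Python spells named groups `(?P<name>...)` instead of `(?<name>...)`, and the helper name `group_spans` is made up for illustration. It also shows why the fourth sample line fails: the literal `\.` is mandatory.

```python
import re

# The question's pattern, with (?<name>...) rewritten as Python's (?P<name>...).
ORIGINAL = re.compile(
    r"(?P<date>(?P<day>\d{1,2})-(?P<month>\d{1,2})-(?P<year>(?:\d{4}|\d{2}))\s"
    r"(?P<time>(?P<hour>\d{2}):(?P<minutes>\d{2}):(?P<seconds>\d{2})\.(?P<milli>\d{0,3})))"
    r"\s(?P<Line>.*)"
)


def group_spans(line):
    """Return {group name: ((start, end), text)} or None if the line does not match."""
    m = ORIGINAL.match(line)
    if m is None:
        return None
    return {name: (m.span(name), m.group(name)) for name in ORIGINAL.groupindex}


if __name__ == "__main__":
    # Trailing dot but no digits: matches, and milli is an empty string.
    for name, (span, text) in group_spans("10-10-15 15:16:41. Some Text here ").items():
        print(f"{name:8} {span} `{text}`")
    # No dot at all: the mandatory \. makes the whole match fail.
    print(group_spans("10-10-15 15:16:41 Some Text here "))
```

The spans are start-inclusive and end-exclusive, which is the same convention the table uses.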

4 answers:

Answer 0: (score: 2)

You can use the following (a slightly modified version of your regex):

(?<date>(?<day>\d{1,2})-(?<month>\d{1,2})-(?<year>(?:\d{4}|\d{2}))\s(?<time>(?<hour>\d{2}):(?<minutes>\d{2}):(?<seconds>\d{2})(?<milli>\.\d{0,3})?))\s(?<logEntry>.*)

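See DEMO

Explanation:

  • The whole <milli> part (the dot together with the digits) is made optional, rather than only the dot, because an optional dot on its own would also match strings such as 10-10-15 15:16:41123 Some Text here.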
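
To see what this pattern returns in practice, here is a minimal sketch in Python (an assumed engine; the answer does not name one, and Python needs `(?P<name>...)` for named groups). The helper `milliseconds` is hypothetical. Note that the `milli` group now includes the leading dot, and that in Python an optional group that did not take part in the match comes back as `None`, so the helper normalises it to `"0"`.

```python
import re

# Answer 0's pattern in Python syntax: the dot and digits are optional as a unit.
ANSWER0 = re.compile(
    r"(?P<date>(?P<day>\d{1,2})-(?P<month>\d{1,2})-(?P<year>(?:\d{4}|\d{2}))\s"
    r"(?P<time>(?P<hour>\d{2}):(?P<minutes>\d{2}):(?P<seconds>\d{2})(?P<milli>\.\d{0,3})?))"
    r"\s(?P<logEntry>.*)"
)


def milliseconds(line):
    """Return the millisecond digits as a string, "0" if absent, None if no match."""
    m = ANSWER0.match(line)
    if m is None:
        return None
    raw = m.group("milli") or ""   # None when the optional group did not participate
    return raw.lstrip(".") or "0"  # drop the captured dot, default to "0"


if __name__ == "__main__":
    for sample in (
        "10-10-15 15:16:41.123 Some Text here ",
        "10-10-15 15:16:41 Some Text here ",
        "10-10-15 15:16:41123 Some Text here ",  # rejected: digits without a dot
    ):
        print(repr(sample), "->", milliseconds(sample))
```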

Answer 1: (score: 0)

Make the milliseconds optional?

/^([\d]{2})-([\d]{2})-([\d]{2}|[\d]{4})\s+([\d]{2}):([\d]{2}):([\d]{2})\.?(\d+)?\s+(.*?)$/

Example:

<?php

$strings = <<< LOL
10-10-15 15:16:41.1 Some Text here 
10-10-15 15:16:41.12 Some Text here 
10-10-15 15:16:41.123 Some Text here 
10-10-15 15:16:41 Some Text here 
LOL;

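// The m flag lets ^ and $ match at every line boundary of the heredoc.
// With PREG_PATTERN_ORDER, group 7 (milliseconds) is an empty string when absent.
preg_match_all('/^([\d]{2})-([\d]{2})-([\d]{2}|[\d]{4})\s+([\d]{2}):([\d]{2}):([\d]{2})\.?(\d+)?\s+(.*?)$/m', $strings , $matches, PREG_PATTERN_ORDER);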
for ($i = 0; $i < count($matches[0]); $i++) {

    $day = $matches[1][$i];
    $month = $matches[2][$i];
    $year = $matches[3][$i];
    $hours = $matches[4][$i];
    $minutes = $matches[5][$i];
    $seconds = $matches[6][$i];
    $ms = $matches[7][$i];
    $text = $matches[8][$i];


    echo "$day $month $year $hours $minutes $seconds $ms $text \n";
}

Regex demo:

https://regex101.com/r/aF9wN6/1

PHP demo:

http://ideone.com/1aEt2E

Regex explanation:

^([\d]{2})-([\d]{2})-([\d]{2}|[\d]{4})\s+([\d]{2}):([\d]{2}):([\d]{2})\.?(\d+)?\s+(.*?)$

Assert position at the beginning of a line (at beginning of the string or after a line break character) (line feed) «^»
Match the regex below and capture its match into backreference number 1 «([\d]{2})»
   Match a single character that is a “digit” (any decimal number in any Unicode script) «[\d]{2}»
      Exactly 2 times «{2}»
Match the character “-” literally «-»
Match the regex below and capture its match into backreference number 2 «([\d]{2})»
   Match a single character that is a “digit” (any decimal number in any Unicode script) «[\d]{2}»
      Exactly 2 times «{2}»
Match the character “-” literally «-»
Match the regex below and capture its match into backreference number 3 «([\d]{2}|[\d]{4})»
   Match this alternative (attempting the next alternative only if this one fails) «[\d]{2}»
      Match a single character that is a “digit” (any decimal number in any Unicode script) «[\d]{2}»
         Exactly 2 times «{2}»
   Or match this alternative (the entire group fails if this one fails to match) «[\d]{4}»
      Match a single character that is a “digit” (any decimal number in any Unicode script) «[\d]{4}»
         Exactly 4 times «{4}»
Match a single character that is a “whitespace character” (any Unicode separator, tab, line feed, carriage return, form feed) «\s+»
   Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the regex below and capture its match into backreference number 4 «([\d]{2})»
   Match a single character that is a “digit” (any decimal number in any Unicode script) «[\d]{2}»
      Exactly 2 times «{2}»
Match the character “:” literally «:»
Match the regex below and capture its match into backreference number 5 «([\d]{2})»
   Match a single character that is a “digit” (any decimal number in any Unicode script) «[\d]{2}»
      Exactly 2 times «{2}»
Match the character “:” literally «:»
Match the regex below and capture its match into backreference number 6 «([\d]{2})»
   Match a single character that is a “digit” (any decimal number in any Unicode script) «[\d]{2}»
      Exactly 2 times «{2}»
Match the character “.” literally «\.?»
   Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
Match the regex below and capture its match into backreference number 7 «(\d+)?»
   Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
   Match a single character that is a “digit” (any decimal number in any Unicode script) «\d+»
      Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match a single character that is a “whitespace character” (any Unicode separator, tab, line feed, carriage return, form feed) «\s+»
   Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the regex below and capture its match into backreference number 8 «(.*?)»
   Match any single character that is NOT a line break character (line feed) «.*?»
      Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Assert position at the end of a line (at the end of the string or before a line break character) (line feed) «$»

Answer 2: (score: 0)

^(\d+)-(\d+)-(\d+)\s(\d+):(\d+):(\d+)\.?(\d*)([a-zA-Z\s]+)

Note that (\d*) still returns the group even when it matches nothing.
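
The difference between a group that matches the empty string and a group that is itself optional can be checked directly. The sketch below uses Python, which is an assumption since this answer names no engine; `ALWAYS` is the pattern from this answer, while `MAYBE` is a hypothetical variant that swaps `(\d*)` for `(\d+)?` purely for comparison. In Python the first yields `""` and the second yields `None`; other engines may report the unmatched group differently.

```python
import re

# This answer's pattern: group 7 always participates, possibly matching "".
ALWAYS = re.compile(r"^(\d+)-(\d+)-(\d+)\s(\d+):(\d+):(\d+)\.?(\d*)([a-zA-Z\s]+)")

# Hypothetical variant for comparison: group 7 is optional as a whole.
MAYBE = re.compile(r"^(\d+)-(\d+)-(\d+)\s(\d+):(\d+):(\d+)\.?(\d+)?([a-zA-Z\s]+)")

NO_MILLIS = "10-10-15 15:16:41 Some Text here "
WITH_MILLIS = "10-10-15 15:16:41.12 Some Text here "

if __name__ == "__main__":
    print(repr(ALWAYS.match(NO_MILLIS).group(7)))    # empty string
    print(repr(MAYBE.match(NO_MILLIS).group(7)))     # None in Python
    print(repr(ALWAYS.match(WITH_MILLIS).group(7)))  # the digits
```

If the calling code wants a blank value rather than a missing one, the `(\d*)` form avoids an extra null check.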

Demo

Answer 3: (score: 0)

Solved it. I needed the following pattern:

(?<date>(?<day>\d{1,2})-(?<month>\d{1,2})-(?<year>(?:\d{4}|\d{2}))\s(?<time>(?<hour>\d{2}):(?<minutes>\d{2}):(?<seconds>\d{2})(?<milli>\.?\d{0,3})))\s(?<logEntry>.*)
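
A quick check of this final pattern, sketched in Python (an assumed engine, with `(?P<name>...)` in place of `(?<name>...)`; the helper `parse` is hypothetical). Because both the dot and the digits are optional inside a group that always participates, `milli` is an empty string when there are no milliseconds, which is the blank value the question asked for. The trade-off is the one Answer 0 points out: digits without a dot are accepted too.

```python
import re

# Answer 3's pattern in Python syntax: milli always participates, dot optional.
ANSWER3 = re.compile(
    r"(?P<date>(?P<day>\d{1,2})-(?P<month>\d{1,2})-(?P<year>(?:\d{4}|\d{2}))\s"
    r"(?P<time>(?P<hour>\d{2}):(?P<minutes>\d{2}):(?P<seconds>\d{2})(?P<milli>\.?\d{0,3})))"
    r"\s(?P<logEntry>.*)"
)


def parse(line):
    """Return a dict of all named groups, or None if the line does not match."""
    m = ANSWER3.match(line)
    return m.groupdict() if m else None


if __name__ == "__main__":
    for sample in (
        "10-10-15 15:16:41.123 Some Text here ",
        "10-10-15 15:16:41 Some Text here ",
        "10-10-15 15:16:41123 Some Text here ",  # also accepted by this pattern
    ):
        fields = parse(sample)
        print(repr(fields["milli"]), repr(fields["logEntry"]))
```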