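Regex matching: extracting only the required string segments

Time: 2013-10-23 15:50:24

Tags: php regex

I am trying to extract three segments from a string. As I am not particularly good with regular expressions, I suspect that what I have done could be done better.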

I want to extract the bold parts of the following string (each ANYTHING_HERE):

SOMETEXT: ANYTHING_HERE (Old=ANYTHING_HERE, New=ANYTHING_HERE)

Some examples might be:

  

ABC: Some_Field (Old=,New=123)

ABC: Some_Field (Old=ABCde,New=1234)

ABC: Some_Field (Old=Hello World,New=Bye Bye World)

So the first example above would return the following matches:

$matches[0] = 'Some_Field';
$matches[1] = '';
$matches[2] = '123';

So far I have the following code:

preg_match_all('/^([a-z]*\:(\s?)+)(.+)(\s?)+\(old=(.+)\,(\s?)+new=(.+)\)/i',$string,$matches);

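The problem with the above is that it returns a match for every individual segment of the string. I don't know how to use a regex to check that the string is in the correct format without capturing and storing every part of it (if that makes sense).

So my question, in case it isn't clear: how can I retrieve only the segments I want from the string above?

5 answers:

Answer 0: (score: 1)

You don't need preg_match_all. You can use this preg_match call: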

$s = 'SOMETEXT: ANYTHING_HERE (Old=ANYTHING_HERE1, New=ANYTHING_HERE2)';
if (preg_match('/[^:]*:\s*(\w*)\s*\(Old=(\w*),\s*New=(\w*)/i', $s, $arr))
   print_r($arr);

Output:

Array
(
    [0] => SOMETEXT: ANYTHING_HERE (Old=ANYTHING_HERE1, New=ANYTHING_HERE2
    [1] => ANYTHING_HERE
    [2] => ANYTHING_HERE1
    [3] => ANYTHING_HERE2
)
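
One caveat worth checking: `\w` matches only letters, digits and underscores, so this pattern should stop matching once a value contains a space. The sketch below is just a quick check against the three sample strings from the question (with the spacing written as `Old=`/`New=`, which is an assumption about the intended input). The helper name `matchesWordOnly` is hypothetical.

```php
<?php
// Hypothetical helper: true when the pattern from this answer matches $s.
function matchesWordOnly(string $s): bool
{
    return preg_match('/[^:]*:\s*(\w*)\s*\(Old=(\w*),\s*New=(\w*)/i', $s) === 1;
}

// Sample strings taken from the question (spacing is an assumption).
$samples = [
    'ABC: Some_Field (Old=,New=123)',
    'ABC: Some_Field (Old=ABCde,New=1234)',
    'ABC: Some_Field (Old=Hello World,New=Bye Bye World)',
];

foreach ($samples as $s) {
    // Prints each sample followed by "match" or "no match".
    echo $s, ' => ', matchesWordOnly($s) ? 'match' : 'no match', PHP_EOL;
}
```

If the values may contain spaces, a negated character class such as `[^,]*` for the old value and `[^)]*` for the new one is one possible alternative to `\w*`.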

Answer 1: (score: 1)

if(preg_match_all('/([a-z]*)\:\s*.+\(Old=(.+),\s*New=(.+)\)/i',$string,$matches)) {
    print_r($matches);
}

Example:

$string = 'ABC: Some_Field (Old=Hello World,New=Bye Bye World)';

Will match:

Array
(
    [0] => Array
        (
            [0] => ABC: Some_Field (Old=Hello World,New=Bye Bye World)
        )

    [1] => Array
        (
            [0] => ABC
        )

    [2] => Array
        (
            [0] => Hello World
        )

    [3] => Array
        (
            [0] => Bye Bye World
        )

)

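Answer 2: (score: 1)

The problem is that you are using more parentheses than you need. Every pair of parentheses creates a capture group, so you end up capturing more segments of the input than you want.

For example, each (\s?)+ segment should just be \s*

The regex you are looking for is: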

[^:]+:\s*(.+)\s*\(old=(.*)\s*,\s*new=(.*)\)

In PHP:

preg_match_all('/[^:]+:\s*(.+)\s*\(old=(.*)\s*,\s*new=(.*)\)/i',$string,$matches);
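
As a rough illustration of the point about parentheses, the sketch below runs a slightly adjusted version of this pattern over the sample strings from the question. Three things here are assumptions of mine rather than part of the answer: the first two groups use lazy quantifiers (`.+?`, `.*?`) so they do not pick up the whitespace that follows them, the pattern is anchored with `^` and `$`, and the helper name `extractSegments` is hypothetical. It uses preg_match because each string holds a single record.

```php
<?php
// Hypothetical helper: returns [field, old, new], or null when the format does not match.
function extractSegments(string $s): ?array
{
    // Only the three wanted segments sit inside capturing parentheses;
    // whitespace is consumed with \s*, which captures nothing.
    $pattern = '/^[^:]+:\s*(.+?)\s*\(old=(.*?)\s*,\s*new=(.*)\)$/i';
    if (!preg_match($pattern, $s, $m)) {
        return null;
    }
    // $m[0] holds the whole match, so drop it and keep only the groups.
    return array_slice($m, 1);
}

foreach ([
    'ABC: Some_Field (Old=,New=123)',
    'ABC: Some_Field (Old=Hello World,New=Bye Bye World)',
] as $s) {
    // Prints the three segments separated by " | ".
    echo implode(' | ', extractSegments($s) ?? ['no match']), PHP_EOL;
}
```

With the greedy `(.+)` of the original pattern, the first group would likely include the space before the opening parenthesis, so either the lazy form or a trim() on the result seems worth considering.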

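A useful tool can be found here: http://www.myregextester.com/index.php

The tool offers an "Explain" checkbox (alongside a "PHP" checkbox and an "i" flag checkbox, which you can tick) that gives a full explanation of the regex. For posterity, I am including that explanation below: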

NODE                     EXPLANATION
----------------------------------------------------------------------
(?i-msx:                 group, but do not capture (case-insensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  [^:]+                    any character except: ':' (1 or more times
                           (matching the most amount possible))
----------------------------------------------------------------------
  :                        ':'
----------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    .+                       any character except \n (1 or more times
                             (matching the most amount possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
----------------------------------------------------------------------
  \(                       '('
----------------------------------------------------------------------
  old=                     'old='
----------------------------------------------------------------------
  (                        group and capture to \2:
----------------------------------------------------------------------
    .*                       any character except \n (0 or more times
                             (matching the most amount possible))
----------------------------------------------------------------------
  )                        end of \2
----------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
----------------------------------------------------------------------
  ,                        ','
----------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
----------------------------------------------------------------------
  new=                     'new='
----------------------------------------------------------------------
  (                        group and capture to \3:
----------------------------------------------------------------------
    .*                       any character except \n (0 or more times
                             (matching the most amount possible))
----------------------------------------------------------------------
  )                        end of \3
----------------------------------------------------------------------
  \)                       ')'
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------

Answer 3: (score: 1)

Something simpler ^_^

[:=]\s*([\w\s]*)

Live DEMO
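
This pattern does not validate the overall format; it captures whatever word characters and whitespace follow each `:` or `=`. A minimal sketch of how it might be applied, assuming preg_match_all to collect every occurrence and trim() to drop the trailing space that `[\w\s]*` picks up (the function name `segmentsAfterDelimiters` is hypothetical):

```php
<?php
// Hypothetical helper: collects the text after every ":" or "=" in $s.
function segmentsAfterDelimiters(string $s): array
{
    // $m[1] holds group 1 of every match, in order of appearance.
    preg_match_all('/[:=]\s*([\w\s]*)/', $s, $m);
    // The character class also matches whitespace, so trim each segment.
    return array_map('trim', $m[1]);
}

print_r(segmentsAfterDelimiters('ABC: Some_Field (Old=Hello World,New=Bye Bye World)'));
```

Because the class only allows word characters and whitespace, a value containing something like `-` or `.` would be cut short at that character.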

Answer 4: (score: 0)

:\s*([^(\s]+)\s*\(Old=([^,]*),New=([^)]*)

Live demo

Please also let me know if you need an explanation.
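
For reference, the pattern can be read piece by piece. The sketch below writes it out with PCRE's extended (`x`) flag so that each part carries a comment; the wrapper function `parseChange` is a hypothetical name, and the pattern itself is unchanged from the one above.

```php
<?php
// Hypothetical helper: returns [field, old, new], or null when nothing matches.
function parseChange(string $s): ?array
{
    $pattern = '~
        :\s*          # the colon after the prefix, then optional whitespace
        ([^(\s]+)     # group 1: field name, up to the first whitespace or "("
        \s*\(Old=     # optional whitespace, then the literal "(Old="
        ([^,]*)       # group 2: old value, everything before the comma (may be empty)
        ,New=         # the literal ",New=" with no whitespace after the comma
        ([^)]*)       # group 3: new value, everything before the closing ")"
    ~x';
    if (!preg_match($pattern, $s, $m)) {
        return null;
    }
    // Drop $m[0] (the full match) and keep the three groups.
    return array_slice($m, 1);
}

print_r(parseChange('ABC: Some_Field (Old=Hello World,New=Bye Bye World)'));
```

Note that, as written, the pattern is case-sensitive and expects `New=` directly after the comma, so an input like `Old=a, New=b` would not match unless `\s*` is added after the comma.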