PHP regex: find a pattern and wrap it in anchor tags

Time: 2017-02-26 12:29:36

Tags: php regex pattern-matching preg-replace preg-match

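I have a string containing movie titles and release years. I want to detect the Title (Year) pattern and, when it matches, wrap it in anchor tags.

The wrapping is easy. But is it possible to write a regex that matches this pattern when I don't know in advance what the movie titles will be?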

Example:

$str = 'A random string with movie titles in it. Movies like The Thing (1984) and other titles like Captain America Civil War (2016). The movies could be anywhere in this string. And some movies like 28 Days Later (2002) could start with a number.';

So the pattern will always be Title (beginning with an uppercase letter) and will end with (Year).

This is what I've got so far:

if (preg_match('/^\p{Lu}[\w%+\/-]+\([0-9]+\)/', $str)) {
    error_log('MATCH');
} else {
    error_log('NO MATCH');
}

This currently doesn't work. As I understand it, this is what it should do:

^\p{Lu} //match a word beginning with an uppercase letter

[\w%+\/-] //with any number of characters following it

Where am I going wrong?
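A minimal sketch of how the pattern above actually behaves, for readers who want to reproduce the problem. The two test inputs are made up for illustration. The `^` anchor only allows a match at the very start of the string, and the character class `[\w%+\/-]` contains no space, so the pattern cannot span a multi-word title or the space before the year.

```php
<?php
// The pattern from the question, unchanged.
$pattern = '/^\p{Lu}[\w%+\/-]+\([0-9]+\)/';

// No match: after "Movies" the next character is a space, which the
// character class does not allow, and ^ forbids starting anywhere later.
var_dump(preg_match($pattern, 'Movies like The Thing (1984)')); // int(0)

// Match: one space-free word at the start, immediately followed by "(digits)".
var_dump(preg_match($pattern, 'Alien(1979)')); // int(1)
```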

2 Answers:

Answer 0: (score: 2)

The following regex should do it:

(?-i)(?<=[a-z]\s)[A-Z\d].*?\(\d+\)

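Explanation

  • (?-i) turns case-insensitive matching off, so the match is case-sensitive
  • (?<=[a-z]\s) is a lookbehind requiring a lowercase letter followed by a whitespace character just before the match
  • [A-Z\d] matches one uppercase letter or digit
  • .*? lazily matches any characters (as few as possible)
  • \(\d+\) matches one or more digits enclosed in literal parentheses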

PHP

<?php
$regex = '/(?-i)(?<=[a-z]\s)[A-Z\d].*?\(\d+\)/';
$str = 'A random string with movie titles in it.
       Movies like The Thing (1984) and other titles like Captain America Civil War (2016).
       The movies could be anywhere in this string.
       And some movies like 28 Days Later (2002) could start with a number.';
preg_match_all($regex, $str, $matches);
print_r($matches);
?>
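The question also asks for the matches to be wrapped in anchor tags, which the snippet above does not show. A minimal sketch, assuming the same regex and a placeholder `href="#"` (the real link target is not given in the question): `preg_replace` can reuse the whole match through `$0`.

```php
<?php
$regex = '/(?-i)(?<=[a-z]\s)[A-Z\d].*?\(\d+\)/';
$str = 'Movies like The Thing (1984) and other titles like Captain America Civil War (2016).';

// $0 refers to the full match; "#" is a placeholder href, not a real target.
$linked = preg_replace($regex, '<a href="#">$0</a>', $str);

echo $linked, PHP_EOL;
// Movies like <a href="#">The Thing (1984)</a> and other titles like <a href="#">Captain America Civil War (2016)</a>.
```

Because of the lookbehind, a title that opens the string or follows punctuation or an uppercase letter would not be matched by this pattern.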

Answer 1: (score: 0)

This regex does the job:

~(?:[A-Z][a-zA-Z]+\s+|\d+\s+)+\(\d+\)~

Explanation

~               : regex delimiter
  (?:           : start non-capturing group
    [A-Z]       : 1 capital letter (use \p{Lu} if you want to match titles in any language)
    [a-zA-Z]+   : 1 or more letters (use \p{L} if you want to match titles in any language)
    \s+         : 1 or more whitespace characters
   |            : OR
    \d+         : 1 or more digits
    \s+         : 1 or more whitespace characters
  )+            : end group, repeated 1 or more times
  \(\d+\)       : 1 or more digits surrounded by parentheses (use \d{4} if the year is always 4 digits)
~               : regex delimiter
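The hints in the explanation above (`\p{Lu}`, `\p{L}`, `\d{4}`) can be combined into one variant. This is only a sketch: the `u` modifier is my addition so that PCRE treats the pattern and subject as UTF-8, and the first title in the sample string is made up.

```php
<?php
// Assumed variant: Unicode letter classes, a 4-digit year, and the u (UTF-8) modifier.
$pattern = '~(?:\p{Lu}\p{L}+\s+|\d+\s+)+\(\d{4}\)~u';

// Sample text; "Ärger Im Paradies" is an invented title for illustration.
$str = 'we watched Ärger Im Paradies (1999) and then 28 Days Later (2002).';

preg_match_all($pattern, $str, $matches);
print_r($matches[0]);
// Array
// (
//     [0] => Ärger Im Paradies (1999)
//     [1] => 28 Days Later (2002)
// )
```

As with the original pattern, every title word needs at least two letters, so a one-letter word such as "A" would break the match, and a capitalized word directly before the title would be pulled into it.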

Implementation

$str = 'A random string with movie titles in it. 
Movies like The Thing (1984) and other titles like Captain America Civil War (2016). 
The movies could be anywhere in this string. 
And some movies like 28 Days Later (2002) could start with a number.';

if (preg_match_all('~(?:[A-Z][a-zA-Z]+\s+|\d+\s+)+\(\d+\)~', $str, $match)) {
    print_r($match);
    error_log('MATCH');
}
else{
    error_log('NO MATCH');
}

Result:

Array
(
    [0] => Array
        (
            [0] => The Thing (1984)
            [1] => Captain America Civil War (2016)
            [2] => 28 Days Later (2002)
        )

)
MATCH
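To go from matching to the anchor tags the question asks for, one option is `preg_replace_callback` with the same pattern. This is a sketch under assumptions: the `/search?q=` URL scheme is hypothetical, since the question does not say where the links should point.

```php
<?php
$str = 'And some movies like 28 Days Later (2002) could start with a number.';

$linked = preg_replace_callback(
    '~(?:[A-Z][a-zA-Z]+\s+|\d+\s+)+\(\d+\)~',
    function (array $m): string {
        // Hypothetical link target: a search URL built from the matched text.
        $href = '/search?q=' . urlencode($m[0]);
        // Escape both the attribute value and the link text for HTML output.
        return '<a href="' . htmlspecialchars($href) . '">' . htmlspecialchars($m[0]) . '</a>';
    },
    $str
);

echo $linked, PHP_EOL;
// And some movies like <a href="/search?q=28+Days+Later+%282002%29">28 Days Later (2002)</a> could start with a number.
```

A callback is used instead of a plain replacement string so the matched title can be URL-encoded before it goes into the `href`.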