Splitting a string by multiple delimiters with preg_match in PHP

Date: 2016-10-25 15:01:23

Tags: php regex preg-match

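A string contains up to three parts: Writer, Director, and Producer. Let's call them "categories". Each category consists of two parts separated by a colon, Label : Names, where Label is one of the category names above and Names is a slash-separated list of names. For example: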

Writer : Jeffrey Schenck / Peter Sullivan / Director : Brian Trenchard-Smith / jack / Producer : smith

I want to split the string into parts by category name and list of names using the preg_match function. Here is what I have so far:

$pattern = '/Writer : (?P<Writer>[\s\S]+?)Director : (?P<Director>[\s\S]+?)Producer : (?P<Producer>[\s\S]+)/';
$sentence = 'Writer : Jeffrey Schenck / Peter Sullivan / Director : Brian Trenchard-Smith / jack / Producer : smith';
preg_match($pattern, $sentence, $matches);

foreach($matches as $cat => $match) {
  // Do more
  // echo "<b>" . $cat . "</b>" . $match . "<br />";
}

The script works well when the string contains all three categories. It fails if at least one of them is missing.
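The failure is easy to reproduce: every label is a mandatory literal in the pattern, so a string without one of them produces no match at all. A minimal sketch, where the shortened sample string is made up for illustration:

```php
<?php
// Same pattern as in the question: all three labels are required literals.
$pattern = '/Writer : (?P<Writer>[\s\S]+?)Director : (?P<Director>[\s\S]+?)Producer : (?P<Producer>[\s\S]+)/';

// Hypothetical input with the Director category left out.
$sentence = 'Writer : Jeffrey Schenck / Peter Sullivan / Producer : smith';

$result = preg_match($pattern, $sentence, $matches);

var_dump($result);  // int(0): the pattern did not match
var_dump($matches); // an empty array, so the foreach loop never runs
```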

1 Answer:

Answer 0: (score: 0)

One approach is to create optional groups with the ? quantifier:

$pattern = '/^' .
  '(?:Writer *: *(?P<Writer>[^:]+))?' .
  '(?:Director *: *(?P<Director>[^:]+))?' .
  '(?:Producer *: *(?P<Producer>[^:]+))?' .
  '$/';
preg_match($pattern, $sentence, $matches);

Here (?:) creates a non-capturing group. Note that the output array is indexed both by numeric position and by name, for example:

Array
(
    [0] => Writer : Jeffrey Schenck / Peter Sullivan / Director : Brian Trenchard-Smith / jack / Producer : smith
    [Writer] => Jeffrey Schenck / Peter Sullivan / 
    [1] => Jeffrey Schenck / Peter Sullivan / 
    [Director] => Brian Trenchard-Smith / jack / 
    [2] => Brian Trenchard-Smith / jack / 
    [Producer] => smith
    [3] => smith
)
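As the output shows, the captured values still end with the " / " separator, and a category that is missing from the input leaves an empty or absent entry. The following is one possible way to post-process the matches into clean lists of names. It is only a sketch: the $credits variable and the sample string without a Director are assumptions made for illustration.

```php
<?php
// Optional-group pattern from the answer.
$pattern = '/^' .
  '(?:Writer *: *(?P<Writer>[^:]+))?' .
  '(?:Director *: *(?P<Director>[^:]+))?' .
  '(?:Producer *: *(?P<Producer>[^:]+))?' .
  '$/';

// Hypothetical input without the Director category.
$sentence = 'Writer : Jeffrey Schenck / Peter Sullivan / Producer : smith';

$credits = [];
if (preg_match($pattern, $sentence, $matches)) {
  foreach (['Writer', 'Director', 'Producer'] as $label) {
    // A skipped optional group is either absent or an empty string.
    if (!isset($matches[$label]) || $matches[$label] === '') {
      continue;
    }
    // Split on slashes, trim whitespace, and drop empty pieces.
    $names = array_map('trim', explode('/', $matches[$label]));
    $credits[$label] = array_values(array_filter($names, 'strlen'));
  }
}

print_r($credits);
```

With this input, $credits should contain a Writer entry with two names and a Producer entry with one name, and no Director key.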

Another approach is to use preg_match_all with some extra processing:

$pattern = '/(?<=:)[^:]+/';
if (preg_match_all($pattern, $sentence, $matches)) {
  // Keys are assigned by position, so all three categories must be present and in this order
  $keys = ['Writer', 'Director', 'Producer'];
  $a = [];
  for ($i = 0; $i < count($matches[0]); ++$i) {
    // The isset() checks are skipped for clarity's sake
    $a[$keys[$i]] = $matches[0][$i];
  }

  print_r($a);
}

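Here (?<=:) is a positive lookbehind assertion for the : character. The resulting array is keyed by name only, without the numeric duplicates. Note that, as printed, each value keeps its leading space and the label of the category that follows it: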

Array
(
    [Writer] =>  Jeffrey Schenck / Peter Sullivan / Director 
    [Director] =>  Brian Trenchard-Smith / jack / Producer 
    [Producer] =>  smith
)
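The lookbehind version assigns keys by position and leaves the next label inside each value, so it misattributes names when a category is missing. One possible refinement, not part of the original answer, is to capture the label itself and stop each value at the next label or at the end of the string. In this sketch the $labels and $credits variables and the sample string without a Writer are assumptions, and it presumes that no name contains a colon.

```php
<?php
// Hypothetical input without the Writer category.
$sentence = 'Director : Brian Trenchard-Smith / jack / Producer : smith';

$labels = 'Writer|Director|Producer';
// Group 1: the label. Group 2: everything up to the next label or the end.
$pattern = '/(' . $labels . ')\s*:\s*([^:]+?)\s*(?=(?:' . $labels . ')\s*:|$)/';

$credits = [];
if (preg_match_all($pattern, $sentence, $matches, PREG_SET_ORDER)) {
  foreach ($matches as $m) {
    // $m[1] is the label, $m[2] the raw slash-separated list of names.
    // PREG_SPLIT_NO_EMPTY drops the empty piece left by a trailing slash.
    $credits[$m[1]] = preg_split('~\s*/\s*~', $m[2], -1, PREG_SPLIT_NO_EMPTY);
  }
}

print_r($credits);
```

Because the keys come from the string itself, a missing category simply has no entry instead of shifting the remaining names under the wrong label.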