PHP RegEx: How do I extract certain words from a sentence?

Asked: 2017-04-19 14:35:11

Tags: php regex

I have Wikitext like this:

Directed by David Fincher. Written by Jim Uhls. Based on the novel by Chuck Palahniuk.

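I'm trying to write a regex that extracts only David Fincher and Jim Uhls; these two names will differ depending on the URL. I wrote the following regex and it does work (after replacing the unwanted text), but is there a better way?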

/(Directed by)([\w\s]+). (Written by)([\w\s]+). /g
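
For reference, here is a minimal sketch of how a pattern like the one above could be run in PHP. It makes two assumptions that are not in the question: the trailing g is dropped, because PHP's preg_* functions reject it as an unknown modifier (preg_match_all is the global equivalent), and the dots are escaped so they only match a literal period. The literal text is also moved outside the groups so nothing needs replacing afterwards. Variable names are illustrative.

```php
<?php

$string = 'Directed by David Fincher. Written by Jim Uhls. Based on the novel by Chuck Palahniuk.';

// Literal text stays outside the groups, so only the names are captured.
// \. matches a literal period; an unescaped . would match any character.
$pattern = '/Directed by ([\w\s]+)\. Written by ([\w\s]+)\./';

$director = null;
$writer = null;
if (preg_match($pattern, $string, $m) === 1) {
    $director = $m[1]; // first capture group
    $writer = $m[2];   // second capture group
}

echo "Director: $director\n";
echo "Writer: $writer\n";
```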

2 Answers:

Answer 0: (score: 1)

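(?:Directed|Written)\s*by\s+ matches "Directed by" or "Written by", followed by whitespace.

\K resets the start of the reported match, so everything matched before it is discarded from the result.

[^.]+ matches one or more characters that are not a dot, so the match stops just before the next period.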

Regex: /(?:Directed|Written)\s*by\s+\K[^.]+/g

Regex demo

<?php

ini_set('display_errors', 1);
$string='Directed by David Fincher. Written by Jim Uhls. Based on the novel by Chuck Palahniuk.';
preg_match_all("/(?:Directed|Written)\s*by\s+\K[^.]+/", $string,$matches);
print_r($matches);

Output:

Array
(
    [0] => Array
        (
            [0] => David Fincher
            [1] => Jim Uhls
        )

)
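
If \K feels opaque, the same names can be obtained with an ordinary capture group. The sketch below runs both variants side by side for comparison; the variable names are illustrative and not part of the answer. With \K the names land in index 0 of the result, with a capture group they land in index 1.

```php
<?php

$string = 'Directed by David Fincher. Written by Jim Uhls. Based on the novel by Chuck Palahniuk.';

// Variant 1: \K drops the "Directed by " / "Written by " prefix from the reported match.
preg_match_all('/(?:Directed|Written)\s*by\s+\K[^.]+/', $string, $withK);

// Variant 2: the prefix stays in the full match and only the name is captured.
preg_match_all('/(?:Directed|Written)\s*by\s+([^.]+)/', $string, $withGroup);

$namesFromK = $withK[0];         // full matches, prefix already discarded
$namesFromGroup = $withGroup[1]; // contents of capture group 1

print_r($namesFromK);
print_r($namesFromGroup);
```

Both arrays should hold the same two names; which variant to prefer is mostly a matter of taste.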

Answer 1: (score: 0)

This is the cleanest thing I could come up with:

<?php

// https://regex101.com/r/dxft9p/1

$string = 'Directed by David Fincher. Written by Jim Uhls. Based on the novel by Chuck Palahniuk.';
$regex = '#Directed by (?<director>\w+\s?\w+). Written by (?<author>\w+\s?\w+)#';
preg_match($regex, $string, $matches);

echo 'Director - '.$matches['director']."\n";
echo 'Author - '.$matches['author'];

See a working example here: https://3v4l.org/AuDL0

Writing (?<somelabel> blah) inside the parentheses creates a named capture group. Very handy!
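
One possible refinement, offered as a sketch rather than as part of the answer: \w+\s?\w+ only fits names of one or two words, so a three-word name would make the whole match fail. The hypothetical helper extractCredits below keeps the named groups but uses [^.]+ (everything up to the next period) and returns null when nothing matches. The second input string is made-up sample data. Note that a name containing a period, such as initials, would still fail to match with this pattern.

```php
<?php

/**
 * Hypothetical helper: returns ['director' => ..., 'author' => ...],
 * or null if the text does not contain both credits.
 */
function extractCredits(string $text): ?array
{
    // [^.]+ takes everything up to the next period, so names of any word count fit.
    $regex = '#Directed by (?<director>[^.]+)\. Written by (?<author>[^.]+)\.#';

    if (preg_match($regex, $text, $m) !== 1) {
        return null;
    }

    return ['director' => $m['director'], 'author' => $m['author']];
}

$credits = extractCredits('Directed by David Fincher. Written by Jim Uhls. Based on the novel by Chuck Palahniuk.');
echo 'Director - ' . $credits['director'] . "\n";
echo 'Author - ' . $credits['author'] . "\n";

// Made-up input with a three-word name.
var_dump(extractCredits('Directed by Anna Maria Example. Written by John Doe.'));

// Input without credits: the helper returns null instead of raising notices.
var_dump(extractCredits('No credits here.'));
```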