Question

I have the text of a book page, which may have footnotes at the end of the string, as in the following example:

والخاتِم بكسر التاء اسم فاعل، فكأنه قد جاء آخر الرسل، والخاتَم بفتح التاء اسم آلة، كأنه قد ختمت به الرسالة.
__________

(1) - سورة الأحزاب آية : 43.
(2) - سورة البقرة آية : 157.
(3) - سورة الأنعام آية : 17.
(4) - سورة الكهف آية : 19.

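The character I am referring to in the example is the Kashida _ (not the hyphen -), known in Latin script as the underscore. What I need is to match the four lines, or any number of lines, below that line.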

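All I have managed so far is to match the first line below that line: /_.*\n*(.*)/gum. The only way I found to get all of them is to repeat the pattern part \n*(.*) n times, where n is the number of footnote lines (four in this example), which is not workable for an arbitrary number of lines.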

Answer 1

You can use the \G anchor here:

preg_match_all('~(?:\G(?!^)|_)\R+\K[^\n]+~', $str, $matches);
print_r($matches[0]);
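
To see why this works, the same pattern can be written in extended (x) mode with a comment on each part. This is only a sketch: the function name extract_footnotes and the sample text are made up for illustration, and [^\r\n]+ is substituted for [^\n]+ on the assumption that the input might use Windows line endings.

```php
<?php
// Hypothetical helper (not part of the original answer): returns every
// non-empty line that follows a line ending in an underscore.
function extract_footnotes(string $str): array
{
    $pattern = '~
        (?: \G(?!^)   # continue where the previous match ended, except at the string start
          | _         # or start right after an underscore
        )
        \R+           # one or more line breaks of any kind
        \K            # discard everything matched so far from the reported match
        [^\r\n]+      # the footnote line itself
    ~x';
    preg_match_all($pattern, $str, $matches);
    return $matches[0];
}

// Illustrative input only.
$page = "Body text.\n__________\n\n(1) - first note.\n(2) - second note.\n";
print_r(extract_footnotes($page));
```

Because each match must begin either at an underscore or exactly where the previous match ended, the body text above the separator is never picked up, and the pattern handles any number of footnote lines without repetition.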

Answer 2

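It is not easy to match the separator line and then capture each following line as its own match in a single pattern. What you can do instead is capture everything after the line, and then match each line within that capture.

You can do it like this: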

 /_{4,}.+/gums
 /(\(.*?\.)*/gums
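
In PHP, the two passes might look like the sketch below. The function name footnotes_two_pass is hypothetical, and the patterns are adapted rather than copied: the first adds a capture group after the underscores, and the second assumes every footnote starts with "(" at the beginning of a line instead of relying on a terminating period.

```php
<?php
// Hypothetical two-pass helper; names and patterns are illustrative only.
function footnotes_two_pass(string $page): array
{
    // Pass 1: capture everything after a run of four or more underscores.
    // The s modifier lets . match line breaks; u treats the input as UTF-8.
    if (!preg_match('/_{4,}\R+(.+)/su', $page, $m)) {
        return []; // no separator line, so no footnotes
    }

    // Pass 2: within the captured block, take each line that starts with "(".
    // The m modifier makes ^ and $ match at line boundaries.
    preg_match_all('/^\(.*$/mu', $m[1], $lines);

    // Strip a trailing \r in case the input used Windows line endings.
    return array_map('rtrim', $lines[0]);
}
```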

I hope this is enough for you.

Answer 3

I just tested this successfully:

$text = "_________\r\n\r\nLine 1\r\nLine 2\r\nLine 3\r\n";
$matches = array();
$pattern = '/_+\r\n\r\n(.+)/s'; // s to have . match newlines.
                                // Change \r\n to \n if appropriate

// Extract all footnotes
preg_match($pattern, $text, $matches);
$footnotes = $matches[1]; // $matches[0] is the whole matched string,
                          // $matches[1] is the part within ()
$matches = array();
$pattern = '/([^\r\n]+)/'; // Match one line at a time, excluding \r and \n

// Extract individual footnotes
preg_match_all($pattern, $footnotes, $matches);
foreach ($matches[0] as $match) { // preg_match_all returns multi-dimensional array
    // Do something with each footnote
}

Regex to match the lines below a line that contains only a specific character

3 answers: