Question

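I'm stumped. I have two regular expressions that each do what I need on their own, but I'm not sure how to get them to work together.

\b([a-zA-Z])?\d{5}\b correctly finds strings that consist of an optional single letter followed by 5 digits.

<a\s+(?:[^>]*?\s+)?href="([^"]*)" matches the URL in an anchor tag.

What I now want to match (for replacement purposes) is each 5-digit number (with or without a preceding letter) that appears inside the URL of an anchor tag.

Sample content: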

<a href="/uploads/2014/04/Draft-99990-Details.doc">Draft 99995 Details</a> <a href="/uploads/2014/04/01090-vs-G01010-series.pdf">01095 vs G01015 Series</a>

There should be 3 matches in this content: the 3 numbers ending in 0, not the ones ending in 5.
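
To make the gap concrete, here is a minimal sketch of what each regex returns on its own for the sample content. PHP is an assumption here (it is the language used in the answers), and the variable names are hypothetical.

```php
<?php
// Sample content from the question.
$content = '<a href="/uploads/2014/04/Draft-99990-Details.doc">Draft 99995 Details</a> '
         . '<a href="/uploads/2014/04/01090-vs-G01010-series.pdf">01095 vs G01015 Series</a>';

// Regex 1 alone: every optional-letter + 5-digit code, inside or outside a URL.
preg_match_all('/\b([a-zA-Z])?\d{5}\b/', $content, $codes);
print_r($codes[0]); // 99990, 99995, 01090, G01010, 01095, G01015

// Regex 2 alone: the href value of each anchor (capture group 1).
preg_match_all('/<a\s+(?:[^>]*?\s+)?href="([^"]*)"/', $content, $urls);
print_r($urls[1]);
```

Regex 1 finds all six codes, while only the three that fall inside the two captured URLs are wanted.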

Answer 1

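Split the task into two parts. First, use a DOM parser (such as PHP's DOMDocument) to retrieve the contents of all href attributes, then use a regular expression to replace the specific parts. The advantage of this approach over a single regular expression is that it won't break even if the format of the markup changes in the future.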

$html = <<<HTML
<a href="/uploads/2014/04/Draft-99990-Details.doc">Draft 99995 Details</a>
<a href="/uploads/2014/04/01090-vs-G01010-series.pdf">01095 vs G01015 Series</a>
HTML;

$dom = new DOMDocument;
$dom->loadHTML($html);

$replacement = 'FOO';
$html = '';

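// Rewrite the href of every anchor; the link text is left untouched.
foreach ($dom->getElementsByTagName('a') as $node) {
    $href = $node->getAttribute('href');
    // Optional letter followed by 5 digits, case-insensitive.
    $node->setAttribute('href', preg_replace('/([a-z])?\d{5}/i', $replacement, $href));
    // Serialize only this anchor, one per line.
    $html .= $dom->saveHTML($node) . PHP_EOL;
}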

echo $html;

Output:

<a href="/uploads/2014/04/Draft-FOO-Details.doc">Draft 99995 Details</a>
<a href="/uploads/2014/04/FOO-vs-FOO-series.pdf">01095 vs G01015 Series</a>
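
If a DOM parser is not an option, the same two-step split can be sketched with the question's two regexes alone: an outer pattern isolates each href value, and a callback applies the inner pattern to it. This is a swapped-in alternative, not the DOM approach recommended above, and it gives up the robustness to markup changes. The variable names and the FOO placeholder are assumptions for illustration.

```php
<?php
// Sample content from the question.
$html = '<a href="/uploads/2014/04/Draft-99990-Details.doc">Draft 99995 Details</a> '
      . '<a href="/uploads/2014/04/01090-vs-G01010-series.pdf">01095 vs G01015 Series</a>';

// Outer pattern: the question's anchor regex. Group 1 keeps the tag prefix
// so it can be emitted unchanged; group 2 is the href value.
$anchorPattern = '/(<a\s+(?:[^>]*?\s+)?href=")([^"]*)"/i';

// Inner pattern: the question's "optional letter + 5 digits" regex.
$codePattern = '/\b[a-zA-Z]?\d{5}\b/';

$result = preg_replace_callback(
    $anchorPattern,
    function (array $m) use ($codePattern) {
        // Replace codes only inside the href value, then restore the closing quote.
        return $m[1] . preg_replace($codePattern, 'FOO', $m[2]) . '"';
    },
    $html
);

echo $result, PHP_EOL;
```

The inner pattern never sees the link text (99995, 01095, G01015), so only the three codes inside the URLs change.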


Answer 2

This expression should do the trick.

(?:<a\s+(?:[^>]*)?href="|(?!^)\G)\K.*?([A-Z]?\d{5})(?=.*?")

Explanation:

(?:                         # BEGIN non-capturing group
    <a\s+(?:[^>]*)?href="   # Anchor tag up until the href attribute
  |                         # OR
    (?!^)\G                 # \G finds the end of the last match
)                           # END non-capturing group
\K                          # Start match over (remove anchor tag from match)
.*?                         # Lazily match the URL
([A-Z]?\d{5})               # Capture an optional letter followed by 5 digits
(?=                         # BEGIN look ahead
    .*?"                    # Lazily match to the end of the URL
)                           # END look ahead

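This is used with the g and i modifiers, for global, case-insensitive matching. Note that each match only extends from the " to the end of the capture group (not to the end of the URL). This is because we have to use \G to find the end of the last match. If we matched the entire URL, \G would resume at the end of the URL and we would miss some of the groups.

Hat tip to Casimir's answer.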
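
A minimal PHP sketch of how the expression might be applied. PHP is an assumption here; it has no g modifier, because preg_match_all() and preg_replace_callback() already match globally. One caveat: `.` also matches `"`, so `.*?` can run past the closing quote when another `"` follows later on the same line (two anchors on one line would let 99995 match as well). The `$strict` variant below, which uses `[^"]*` instead, is an assumed hardening and not part of the original answer; the FOO placeholder is likewise only illustrative.

```php
<?php
// Two anchors on separate lines.
$html = <<<HTML
<a href="/uploads/2014/04/Draft-99990-Details.doc">Draft 99995 Details</a>
<a href="/uploads/2014/04/01090-vs-G01010-series.pdf">01095 vs G01015 Series</a>
HTML;

// The expression from this answer, with only the i modifier written out.
$pattern = '~(?:<a\s+(?:[^>]*)?href="|(?!^)\G)\K.*?([A-Z]?\d{5})(?=.*?")~i';
preg_match_all($pattern, $html, $m);
print_r($m[1]); // 99990, 01090, G01010

// Assumed hardening: [^"]* cannot cross the closing quote, so anchors that
// share a single line are handled too.
$strict = '~(?:<a\s+(?:[^>]*)?href="|(?!^)\G)\K[^"]*?([A-Z]?\d{5})(?=[^"]*")~i';
$oneLine = str_replace("\n", ' ', $html);
preg_match_all($strict, $oneLine, $s);
print_r($s[1]); // 99990, 01090, G01010

// Each match is "URL text up to and including the code", so the callback
// keeps the prefix and swaps only the captured code.
$replaced = preg_replace_callback($strict, function (array $x) {
    return substr($x[0], 0, -strlen($x[1])) . 'FOO';
}, $oneLine);
echo $replaced, PHP_EOL;
```

Because \K sits before the lazy part, the full match still contains the URL text that precedes each code, which is why the replacement goes through a callback rather than a plain replacement string.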

How to put two regular expressions together
