Question

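I have HTML content stored in my database. I need to fetch that content, match certain anchor tags in it, and replace each matching anchor tag with a string of my choosing.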

Suppose the following HTML is stored in my WordPress database.

      <h3>Complications</h3>
      <p><strong>The three most common serious gastric sleeve complications</strong> are:</p>
      <ul>
         <li>
         <a href="https://insights.ovid.com/pubmed?pmid=28938270" target="_blank">3</a>
            <span><a href="javascript:;" class="list_expand">Staple line leaks</a> -  2.1% of patients on average (between 1.09% and 4.66%, depending on the study) experience staple line leaks (<a href="#reference-box">9</a>) (<a href="#reference-box">10</a>)</span>
            <div class="list_expand_content blockquote"></div>
         </li>
         <li>
            <span><a href="javascript:;" class="list_expand">Bleeding</a> - 1.2% of patients (<a href="#reference-box">11</a>)</span>
            <div class="list_expand_content blockquote"></div>
         </li>
         <li>
            <span><a href="javascript:;" class="list_expand">Stenosis/Strictures</a> -  0.6% of patients (<a href="#reference-box">12</a>)</span>
            <div class="list_expand_content blockquote"></div>
         </li>
      </ul>

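What I need to do is match

<a anything goes here>[0-999]</a>

and replace that anchor tag with a shortcode such as [ref link='the link from the anchor tag' number='the number wrapped between the opening and closing anchor tags'].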

I wrote the following code to match the anchors and capture the numbers.

preg_match_all('/<a[^>]+>(\d{1,3})<\/a>/',$content,$matches, PREG_PATTERN_ORDER);

But how do I replace each match with the shortcode in the content that comes from the database?

Answer 1

Run the following regular expression: <a[^>]+href="([^"]+?)"[^>]*>(\d{1,3})<\/a>

Use this replacement: [ref link='$1' number='$2']

As you can see, it replaces this:

<a href="https://insights.ovid.com/pubmed?pmid=28938270" target="_blank">3</a>

with this:

[ref link='https://insights.ovid.com/pubmed?pmid=28938270' number='3']

For background on how $1 and $2 work, read up on capturing groups and backreferences.
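As a minimal sketch of how this pattern and replacement could be applied in PHP, the snippet below runs preg_replace over a hard-coded sample string. The sample string stands in for the content loaded from the database and is an assumption made for illustration. The pattern here uses [^>]* after the href value, so anchors whose href is the last attribute, such as <a href="#reference-box">9</a>, are also matched.

```php
<?php
// Sample content standing in for the HTML loaded from the database (assumption).
$content = '<a href="https://insights.ovid.com/pubmed?pmid=28938270" target="_blank">3</a> '
         . '(<a href="#reference-box">9</a>) '
         . '<a href="javascript:;" class="list_expand">Bleeding</a>';

// Group 1 captures the href value, group 2 the 1-3 digit number.
// [^>]* after the href allows zero or more further attributes.
$pattern = '/<a[^>]+href="([^"]+?)"[^>]*>(\d{1,3})<\/a>/';

// $1 and $2 are backreferences to the two capture groups.
$replacement = '[ref link=\'$1\' number=\'$2\']';

$output = preg_replace($pattern, $replacement, $content);

// The two numeric anchors become shortcodes; the "Bleeding" anchor is left alone.
echo $output, PHP_EOL;
```

Anchors whose text is not a 1-3 digit number do not match and stay unchanged. In WordPress the result would still have to be written back to the post, for example with wp_update_post, which is not shown here.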

Answer 2

Here is my answer.

$output = preg_replace_callback('/<a([^>]+)>\d{1,3}<\/a>/',function($matches) { return '[ref '.$matches[1].']'; }, $content);
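
The callback above copies the anchor's raw attribute string into the shortcode and does not include the number. If the link and number format from the question is wanted, one possible variant is sketched below. The function name anchors_to_ref_shortcodes and the sample string are hypothetical, chosen only for illustration.

```php
<?php
// Hypothetical helper: turns numeric reference anchors into [ref] shortcodes.
function anchors_to_ref_shortcodes(string $content): string
{
    $result = preg_replace_callback(
        // Group 1: the raw attribute string, group 2: the 1-3 digit number.
        '/<a\b([^>]*)>(\d{1,3})<\/a>/',
        function (array $m): string {
            // Extract the href value from the captured attributes.
            if (!preg_match('/(?:^|\s)href="([^"]*)"/', $m[1], $href)) {
                return $m[0]; // no href found: leave the anchor untouched
            }
            return "[ref link='" . $href[1] . "' number='" . $m[2] . "']";
        },
        $content
    );

    // preg_replace_callback returns null on error; fall back to the input.
    return $result ?? $content;
}

// Sample input (assumption), modeled on the HTML in the question.
$sample = '<span>1.2% of patients (<a href="#reference-box">11</a>)</span>';
echo anchors_to_ref_shortcodes($sample), PHP_EOL;
```

A callback is more than this case strictly needs, but it leaves room for extra logic per match, such as skipping anchors without an href or normalizing the link before building the shortcode.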
