Complex regular expression - turning words in HTML into links that search for that word

Time: 2017-03-20 09:09:31

Tags: html regex qregularexpression

I have more than 1000 pages in which certain words need to be turned into links that contain those words.

Basically, I would like to know how to use a regular expression to do something like the following.

Change:

<span class="TagsTStyle">PRODUCTS / SERVICES:</span>
<span class="TagsStyle">ACCOUNTANT, TAX, FINANCIAL PLANNING, GST, BAS, TAX RETURNS</span>

to:

<span class="TagsTStyle">PRODUCTS / SERVICES:</span>
<span class="TagsStyle"><a href="../search.php?searchQuery=ACCOUNTANT">ACCOUNTANT</a>, <a href="../search.php?searchQuery=TAX">TAX</a>, <a href="../search.php?searchQuery=FINANCIAL+PLANNING">FINANCIAL PLANNING</a>, <a href="../search.php?searchQuery=GST">GST</a>, <a href="../search.php?searchQuery=BAS">BAS</a>, <a href="../search.php?searchQuery=TAX+RETURNS">TAX RETURNS</a></span>
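
The target markup implies that each keyword is kept as the link text, while spaces inside the query string become "+". A minimal JavaScript sketch of that mapping for a single keyword follows. The helper name keywordToLink is hypothetical, the "../search.php?searchQuery=" prefix is copied from the example, and the sketch assumes the keywords are plain text without HTML entities.

```javascript
// Hypothetical helper: turn one keyword into the anchor shown in the target markup.
// Assumption: the keyword is plain text (no HTML entities such as &amp;).
function keywordToLink(keyword) {
  const text = keyword.trim();
  // encodeURIComponent escapes reserved characters and emits spaces as %20;
  // swap those for "+" to match the query strings in the example.
  const query = encodeURIComponent(text).replace(/%20/g, "+");
  return `<a href="../search.php?searchQuery=${query}">${text}</a>`;
}

console.log(keywordToLink("TAX RETURNS"));
// prints: <a href="../search.php?searchQuery=TAX+RETURNS">TAX RETURNS</a>
```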

There are 1000+ pages, and the words are different on every page.

On every page, the keywords that need to be linked all sit inside this span:
<span class="TagsStyle">

Every word or phrase inside that span is separated from the next by a comma.

I am fairly sure this can be done with a regular expression, but this one is a bit too complex for me to work out.

An example of the HTML used in the pages:

<div align="center">
      <span class="CatTStyle">Category:</span>
      <span class="CatStyle">PHYSIOTHERAPY</span>
      <br>
      <br>
      <span class="BusTStyle">Business Name:</span>
      <span class="BusStyle">Physio</span>
      <br>
      <span class="PhTStyle">Phone:</span>
      <span class="PhStyle"><a onclick="_gaq.push(['_trackEvent', 'Phone', 'Click to Call', document.title])" href="tel:555 5555">555 5555 <img src="img/call.png"></a></span>
      <br>
      <span class="AddrTStyle">Address:</span>
      <span class="AddrStyle">1 Street Rd, Town, Country</span>
      <br>
      <span class="EmlTStyle">Email:</span>
      <span class="EmlStyle"><a onclick="_gaq.push(['_trackEvent', 'Email', 'Click to Email', document.title])" href="mailto:email@email.com">email@email.com</a></span>
      <br>
      <br>
      <span class="WsTStyle">Website:</span>
      <span class="WsStyle"><a onclick="_gaq.push(['_trackEvent', 'Website', 'Click to Website', document.title])" href="http://www.webiste.com">www.website.com</a></span>
      <br>
      <br>
      <span class="TagsTStyle">PRODUCTS / SERVICES:</span>
      <span class="TagsStyle">PHYSIOTHERAPY, BACK PAIN, SPINE INJURY</span>
      <br>
    </div>
<script async type="text/javascript">
    if ($(window).width() > 800) {document.write("</td><td align='center' valign='top' width='350'>");}
    if ($(window).width() < 800) {document.write("</td></tr><tr><td align='center' valign='top' width='350'>");} 
  </script> 
    <br>
    <div id="map" align="left" style="text-align:left;"></div>
    <script type="text/javascript">
      var address='1 Street Rd, Town, Country';
      var map = new google.maps.Map(document.getElementById('map'), {
        mapTypeId: google.maps.MapTypeId.TERRAIN,
        zoom: 15
      });
      var geocoder = new google.maps.Geocoder();
      geocoder.geocode({
        'address': address
      },
      function(results, status) {
        if(status == google.maps.GeocoderStatus.OK) {
          new google.maps.Marker({
          position: results[0].geometry.location,
          map: map
        });
        map.setCenter(results[0].geometry.location);
        }
      });
    </script>

Note, however, that I only want links created for each comma-separated value between <span class="TagsStyle"> and its closing </span>.
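
To make that scope concrete, here is a small JavaScript sketch that collects the comma-separated values from the TagsStyle span only and ignores every other span, including ones that contain commas, such as the address. The function name extractTagKeywords is hypothetical, and the sketch assumes the span holds plain text with no nested tags.

```javascript
// Hypothetical helper: list the keywords of every TagsStyle span in a page.
function extractTagKeywords(html) {
  // Only class="TagsStyle" matches; the label span (class="TagsTStyle")
  // and the other *Style spans are skipped. [^<]* also skips spans that
  // already contain tags, for example ones that were linked earlier.
  const spanPattern = /<span\b[^>]*\bclass="TagsStyle"[^>]*>([^<]*)<\/span>/g;
  const keywords = [];
  for (const match of html.matchAll(spanPattern)) {
    for (const part of match[1].split(",")) {
      const keyword = part.trim();
      if (keyword !== "") keywords.push(keyword);
    }
  }
  return keywords;
}

const page = `
<span class="CatStyle">PHYSIOTHERAPY</span>
<span class="AddrStyle">1 Street Rd, Town, Country</span>
<span class="TagsTStyle">PRODUCTS / SERVICES:</span>
<span class="TagsStyle">PHYSIOTHERAPY, BACK PAIN, SPINE INJURY</span>`;

console.log(extractTagKeywords(page));
```

For the sample page this yields the three keywords PHYSIOTHERAPY, BACK PAIN and SPINE INJURY, and nothing from the address line.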

1 Answer:

Answer 0: (score: 1)

Update #1

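Since QRegularExpression implements Perl-compatible regular expressions, you can take advantage of the match reset token \K and the \G assertion. The pattern below relies on \G, which anchors each match at the position where the previous match ended: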

(<span\b[^"]+class="TagsStyle"[^>]*>|(?!\A)\G)([^,<]+)(,?\s*)

Replacement string:

\1<a href="../search.php?searchQuery=\2">\2</a>\3
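
For readers who want to see what the \G chain does step by step, here is a rough JavaScript sketch. JavaScript regular expressions have no \G, so this swaps in the sticky flag (y), which likewise forces each match to start exactly where the previous one ended. The function name linkTagSpans is hypothetical and the opening-tag pattern is a slightly tightened variant of the one above, so treat it as an illustration rather than an equivalent of the PCRE pattern. Like the replacement string, it copies the keyword verbatim into the href, so spaces are not converted to "+".

```javascript
// Sketch: emulate the \G-anchored chain with JavaScript's sticky flag.
function linkTagSpans(html) {
  // Opening tag of the keyword span (global, to scan the whole page).
  const openTag = /<span\b[^>]*\bclass="TagsStyle"[^>]*>/g;
  // One keyword plus its trailing separator. The "y" flag anchors each
  // match at lastIndex, which is the role \G plays in the PCRE pattern.
  const item = /([^,<]+)(,?\s*)/y;

  let out = "";
  let pos = 0;
  let open;
  while ((open = openTag.exec(html)) !== null) {
    // Copy everything up to and including the opening tag unchanged.
    out += html.slice(pos, openTag.lastIndex);
    pos = openTag.lastIndex;

    item.lastIndex = pos;
    let m;
    while ((m = item.exec(html)) !== null) {
      // Same substitution as the replacement string: \2 wrapped in a link, then \3.
      out += `<a href="../search.php?searchQuery=${m[1]}">${m[1]}</a>${m[2]}`;
      pos = item.lastIndex;
    }
    // The chain stops at the first "<"; resume the tag search from there.
    openTag.lastIndex = pos;
  }
  return out + html.slice(pos);
}

const input = '<span class="TagsTStyle">PRODUCTS / SERVICES:</span>\n' +
  '<span class="TagsStyle">GST, BAS, TAX RETURNS</span>';
console.log(linkTagSpans(input));
```

The label span passes through untouched, and each of the three keywords in the TagsStyle span is wrapped in its own link.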
