(preg_replace) Regex to replace all &amp; inside <a href="">

Date: 2016-08-05 11:19:19

Tags: php regex preg-replace

I somehow can't get this to work: I have a simple string, for example:

<p>Foo &amp; Bar</p> // <-- this should still be &amp;
<a href="http://test.com/?php=true&amp;test=test&amp;p=p"> // <-- Only this string should be affected and changed to &
<div> Yes &uuml; No</div> // <-- This should still be &uuml;

<a href="http://mycoolpage.com/?page=1&amp;fun=true&amp;foo=bar&amp;yes=no">

Now I want to replace every &amp; inside the href with a plain & using preg_replace. I tried to write a regex for this, but I can't get it to work.

This is how far I've come: it finds only the last &amp;, also matches the whole string before it, and fails to find the others. What am I doing wrong?

(?>=href\=\").*?(&amp;)(?=\")

Edit: It is not possible to use html_entity_decode or htmlspecialchars_decode on the whole string, as there is other code that would be affected.

2 answers:

Answer 0 (score: 0)

The natural way I see, without digging deeper into the PHP regex API, is to keep matching the string against the pattern until there are no more matches; once the last &amp; inside the href has been replaced, the pattern stops matching.

Result:

$str = "<p>Foo &amp; Bar</p> // <-- this should still be &amp; <a href=\"http://mycoolpage.com/?page=1&amp;fun=true&amp;foo=bar&amp;yes=no\">";
// [^"] instead of . keeps every match inside a single href value
$pattern = '/(href="[^"]*?)(&amp;)([^"]*">)/';
while (preg_match_all($pattern, $str, $matches)) {
    $left = $matches[1][0]; // e.g. href="http://....?page=1
    $before = substr($str, 0, strpos($str, $left)); // <p>Foo &amp; ....
    $index = strlen($before) + strlen($left); // offset of the &amp; to replace
    $str = substr_replace($str, "&", $index, strlen("&amp;"));
}
var_dump($str);
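The same repeat-until-no-match idea can probably be written more compactly with the count argument of preg_replace. The following is only a sketch: the helper name replaceAmpInHrefs() is hypothetical, and it assumes href values are double-quoted.

```php
<?php
// Sketch only: replaceAmpInHrefs() is a hypothetical helper name.
// Assumes href attribute values are wrapped in double quotes.
function replaceAmpInHrefs(string $html): string
{
    // Each pass rewrites one &amp; per href, because a match consumes
    // the whole attribute value; [^"] keeps the match inside the quotes.
    $pattern = '/(href="[^"]*?)&amp;([^"]*")/';
    do {
        // $count receives the number of replacements made in this pass.
        $html = preg_replace($pattern, '$1&$2', $html, -1, $count);
    } while ($count > 0);

    return $html;
}

$str = '<p>Foo &amp; Bar</p> <a href="http://mycoolpage.com/?page=1&amp;fun=true&amp;foo=bar&amp;yes=no">';
echo replaceAmpInHrefs($str), "\n";
```

Because the loop keeps going until nothing matches, a doubly encoded &amp;amp; inside an href would end up as a plain & as well; that may or may not be what you want.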

Answer 1 (score: 0)

Wiktor Stribiżew's comment works:

  

Or the harder way: http://ideone.com/ADku3b

<?php
$s = '<a href="http://myurl.com/?page=1&amp;fun=true&amp;foo=bar&amp;yes=no">';
echo preg_replace_callback('~(<a\b[^>]*href=)(([\'"]).*?\3|\S+)([^>]*>)~', function ($m) {
  return $m[1] . html_entity_decode($m[2]) . $m[4];
}, $s);
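Note that html_entity_decode in the callback above decodes every entity inside the href, not only &amp;. If only &amp; should change, a variant along these lines might be enough. It is a sketch under assumptions: decodeAmpInHref() is a hypothetical name, and only double-quoted href values on <a> tags are handled.

```php
<?php
// Sketch only: decodeAmpInHref() is a hypothetical helper name.
// Handles double-quoted href values on <a> tags and nothing else.
function decodeAmpInHref(string $html): string
{
    $result = preg_replace_callback(
        '~(<a\b[^>]*?\bhref=")([^"]*)(")~i',
        function (array $m): string {
            // Touch only &amp;; other entities such as &uuml; stay as they are.
            return $m[1] . str_replace('&amp;', '&', $m[2]) . $m[3];
        },
        $html
    );

    // preg_replace_callback returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

$s = '<p>Foo &amp; Bar</p><a href="http://test.com/?php=true&amp;test=test&amp;p=p"><div> Yes &uuml; No</div>';
echo decodeAmpInHref($s), "\n";
```

Unlike a loop, this makes a single pass over each href value, so a doubly encoded &amp;amp; is decoded only once.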