Replacing part of an HTML string with a regular expression?

Date: 2014-03-10 14:56:28

Tags: php regex

This is my string:

<ol>
    <li>
        <a rel="nofollow" href="http://127.0.0.1/index.php/something?price=3%2C25"><span class="price">50,00€</span> - <span class="price">75,00€</span></a> (38)
    </li>
    <li>
        <a rel="nofollow" href="http://127.0.0.1/index.php/something?price=4%2C25"><span class="price">75,00€</span> - <span class="price">100,00€</span></a> (11)
    </li>
</ol>

I want to replace

<span class="price">50,00€</span> - <span class="price">75,00€</span>

with "foobar", and

<span class="price">75,00€</span> - <span class="price">100,00€</span>

with "foobar2". The only way to tell which line gets which replacement is the price=3 or price=4 part of the URL.

So after the replacement, the string should look like this:

<ol>
    <li>
        <a rel="nofollow" href="http://127.0.0.1/index.php/something?price=3%2C25">foobar</a> (38)
    </li>
    <li>
        <a rel="nofollow" href="http://127.0.0.1/index.php/something?price=4%2C25">foobar2</a> (11)
    </li>
</ol>

I tried preg_replace, but it always matches too much of the string. Any ideas?

Thanks for your help!

1 Answer:

Answer 0 (score: 0)

If you want to change the content based on some kind of key, I'd suggest using an array to hold the values.

<?php

$foo_array = array(3 => 'foobar', 4 => 'foobar2');

Next, I'm just adding the original string so there is something to operate on:

$string = '<ol>
    <li>
        <a rel="nofollow" href="http://127.0.0.1/index.php/something?price=3%2C25"><span class="price">50,00€</span> - <span class="price">75,00€</span></a> (38)
    </li>
    <li>
        <a rel="nofollow" href="http://127.0.0.1/index.php/something?price=4%2C25"><span class="price">75,00€</span> - <span class="price">100,00€</span></a> (11)
    </li>
</ol>';

Then you can use something like this to do the replacement.

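// Group 1: rest of the opening tag, group 2: the key (3 or 4),
// group 4: the link text (lazy, stops at the first </a>), group 5: </a>.
$new_string = preg_replace_callback(
    '/(price=(3|4)([A-Z0-9%]+)">)(.*?)(<\/a>)/s',
    function ($m) use ($foo_array) {
        return $m[1] . $foo_array[$m[2]] . $m[5];
    },
    $string
);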

print $new_string;
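
The "matches too much" symptom from the question is what a greedy `.*` usually produces: it runs on to the last `</a>` in the string instead of stopping at the nearest one. Here is a minimal sketch of the difference. The two-link string in `$html` is a shortened, hypothetical stand-in for the original markup, and the `\d+` / `[^"]*` pattern is my own simplification, not the pattern used above.

```php
<?php
// Shortened stand-in for the original markup (hypothetical sample data).
$html = '<a href="?price=3%2C25"><span>A</span></a> (38) '
      . '<a href="?price=4%2C25"><span>B</span></a> (11)';

// Greedy: .* extends to the LAST </a>, so both links end up in one match.
preg_match_all('/price=(\d+)[^"]*">(.*)<\/a>/s', $html, $greedy);

// Lazy: .*? stops at the FIRST </a> that follows each opening tag.
preg_match_all('/price=(\d+)[^"]*">(.*?)<\/a>/s', $html, $lazy);

echo count($greedy[0]), "\n"; // number of matches found by the greedy pattern
echo count($lazy[0]), "\n";   // number of matches found by the lazy pattern
print_r($lazy[1]);            // the keys captured after "price="
```

With the lazy version each link is matched on its own, which is what lets the callback look up a separate replacement per key.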

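Alternatively, if you want to follow Marc B's suggestion from the comments, you can use PHP's DOM manipulation to do the job. I'm no expert in it, but I gave it a try, and it appears to produce the same output as the regex solution above (except that it adds the DOCTYPE, html, and body wrappers):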

$dom_document = new DOMDocument(); // CREATE A NEW DOCUMENT
$dom_document->loadHTML($string); // LOAD THE STRING INTO THE DOCUMENT
$links = $dom_document->getElementsByTagName('a'); // PULL OUT THE LINKS OUT OF THE DOCUMENT

// LOOP THROUGH EACH LINK
foreach ($links AS $link) {

    // IF WE FIND A 3 OR A 4 AFTER price=, THEN, REPLACE THE TEXT OF 
    // - THE LINK (THE nodeValue) WITH THE ITEM FROM THE ARRAY
    if (preg_match('/price=(3|4)/ms', $link->getAttribute('href'), $m)) {
        $link->nodeValue = $foo_array[$m[1]];
    }

}

$new_string_2 = $dom_document->saveHTML(); // WRITE THE CHANGES TO A STRING

print $new_string_2;
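
If the extra DOCTYPE, html, and body wrappers are unwanted, one option is to pass a node to `saveHTML()`, which serializes only that subtree. The sketch below also swaps the regex on the href for `parse_url()` and `parse_str()`, so the key is read from the decoded query string. This is a variation under assumptions, not a tested replacement for the code above: the sample string is an ASCII-only stand-in for the original markup (the euro signs are dropped to sidestep character-encoding questions), and `$doc`, `$params`, `$key`, and `$fragment` are names introduced here for illustration.

```php
<?php
$foo_array = array(3 => 'foobar', 4 => 'foobar2');

// ASCII-only stand-in for the original markup (hypothetical sample data).
$string = '<ol>'
    . '<li><a rel="nofollow" href="http://127.0.0.1/index.php/something?price=3%2C25">'
    . '<span class="price">50,00</span> - <span class="price">75,00</span></a> (38)</li>'
    . '<li><a rel="nofollow" href="http://127.0.0.1/index.php/something?price=4%2C25">'
    . '<span class="price">75,00</span> - <span class="price">100,00</span></a> (11)</li>'
    . '</ol>';

$doc = new DOMDocument();
$doc->loadHTML($string);

foreach ($doc->getElementsByTagName('a') as $link) {
    // Decode the query string: "price=3%2C25" becomes array('price' => '3,25').
    $query = (string) parse_url($link->getAttribute('href'), PHP_URL_QUERY);
    parse_str($query, $params);
    if (!isset($params['price']) || !is_string($params['price'])) {
        continue;
    }

    // The part before the comma is used as the lookup key.
    $parts = explode(',', $params['price']);
    $key = (int) $parts[0];
    if (isset($foo_array[$key])) {
        // Setting nodeValue replaces the link's children with plain text.
        $link->nodeValue = $foo_array[$key];
    }
}

// Serialize only the <ol> subtree instead of the whole document.
$ol = $doc->getElementsByTagName('ol')->item(0);
$fragment = $doc->saveHTML($ol);

print $fragment;
```

Reading the key from the parsed query string means a value such as price=30 is not mistaken for key 3, which the `(3|4)` alternation would not distinguish.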