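str_replace() with an entire web page as the subject

Time: 2015-11-11 21:00:40

Tags: php html html5 replace str-replace

I'm trying to use str_replace() to search for and replace a specific string in an HTML page. For example, I'm replacing: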

$search_string = 'The new&nbsp;funding follows a <a href="http://blog.classpass.com/2015/01/15/were-so-excited-to-share-our-biggest-news-ever/">$40 million raise announced</a> in January.';

$replacement = '<span class="newString">The new&nbsp;funding follows a <a href="http://blog.classpass.com/2015/01/15/were-so-excited-to-share-our-biggest-news-ever/">$40 million raise announced</a> in January.</span>';

$subject = file_get_contents("some-web-site.html");

$new_string = str_replace($search_string, $replacement, $subject);

However, when $subject contains a large amount of HTML, the replacement doesn't work. If I do this:

$subject = "some text some text " .  $search_string . "some text some text";

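the sentence is replaced correctly. The problem seems to be caused by the &nbsp; entity. If $search_string doesn't contain &nbsp;, it is replaced successfully no matter how complex $subject is (even if it contains a complete web page).

Any idea why this happens?

1 answer:

Answer 0: (score: 0)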

The problem appears to be with the $40 part of $search_string, not the &nbsp;.

Take this program for example:

<?php

$input = '$40 &nbsp; replace failed'; // single-quoted literal, so "$40" is not interpolated

// str_replace() compares plain strings, so "$" has no special meaning.
$str_replace_result = str_replace('$40 &nbsp; replace failed', 'str_replace worked', $input);
print_r($str_replace_result . "\n"); // ==> works

// In a regex, an unescaped "$" is an end-of-subject anchor, so this pattern cannot match.
$preg_replace_result = preg_replace('/$40 &nbsp; replace failed/', 'preg_replace worked', $input);
print_r($preg_replace_result . "\n"); // ==> fails: $input is printed unchanged

// Example without the "$40"
$another_string = '&nbsp; replace 2 failed';
$preg_replace_result2 = preg_replace('/&nbsp; replace 2 failed/', 'preg_replace worked', $another_string);
print_r($preg_replace_result2 . "\n"); // ==> works, implying the "$40" bit was the issue

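To fix this, escape the search string with preg_quote() before using it in a pattern. More information is in the preg_quote documentation.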
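As a rough sketch of that fix, the snippet below builds the pattern with preg_quote() instead of writing the "$" into it by hand. The variable names and sample strings are made up for illustration and are not taken from the question's page; the "/" delimiter is passed as the second argument so that it would be escaped as well.

```php
<?php

// Hypothetical sample data, for illustration only.
$search  = '$40 &nbsp; replace failed';
$subject = 'before $40 &nbsp; replace failed after';

// preg_quote() backslash-escapes regex metacharacters such as "$".
// The second argument also escapes the "/" delimiter used below.
$pattern = '/' . preg_quote($search, '/') . '/';

$result = preg_replace($pattern, 'preg_replace worked', $subject);
print_r($result . "\n"); // ==> before preg_replace worked after
```

One related caveat: preg_replace() also reads "$40" inside the replacement string as a backreference, so a replacement that contains it would need similar care. When the search is purely literal, str_replace() sidesteps both issues.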

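In summary, the problem is an unescaped metacharacter causing the match to fail: in a regular expression, $ is an end-of-subject anchor rather than a literal dollar sign.

All that said, is it necessary to run a replacement over an entire web page this way? This approach (as is evident here) seems error-prone.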