我正在尝试删除文字前的所有<br>
。
所以我有这个:
<p>
<br/><br/>When the battle is on between contestants in a talent show, it gets really competitive when down to the last four. X-FactorUSAcontestant Marcus Canty knows this all too well as this is the stage he was voted off of the show earlier this year. <br/><br/>
</p>
我想摆脱前两个<br/>
,但如果超过2,我还想摆脱它们。
我更喜欢起诉xpath,因为我已经在使用它了,此刻我已经拥有了它。
foreach($xpath->query('//br[not(preceding::text())]') as $node) {
$node->parentNode->removeChild($node);
}
由于某些原因,在这个特定的页面上它似乎没有起作用。
更新
最初的问题是为什么在文档开头时我的xpath应该摆脱它们(见下文)。我应用了一些正则表达式,看看是否有效揭示了你现在看到的doctype。我认为doctype在某种程度上导致了我原来的问题,但它直到现在才被显示出来。这个内容是我从博客中导入的内容,目前正在操纵以适应新的博客。
!DOCTYPE html PUBLIC “-//W3C//DTD HTML 4.0 Transitional//EN” “http://www.w3.org/TR/REC-html40/loose.dtd”><br><br>
这是我的代码:
global $post;
$postTime = $post - > post_date;
$postTime = strtotime($postTime);
$startDate = "2014/01/16";
if ($postTime < strtotime($startDate)) {
$html = mb_convert_encoding($content, 'HTML-ENTITIES', "UTF-8");
$doc = new DOMDocument();@$doc - > loadHTML($html);
$xpath = new DOMXPath($doc);
foreach($xpath - > query('//br[not(preceding::text())]') as $node) {
$node - > parentNode - > removeChild($node);
}
$nodes = $xpath - > query('//a[string-length(.) = 0]');
foreach($nodes as $node) {
$node - > parentNode - > removeChild($node);
}
$nodes = $xpath - > query('//*[not(text() or node() or self::br)]');
foreach($nodes as $node) {
$node - > parentNode - > removeChild($node);
}
remove_filter('the_content', 'wpautop');
$content = $doc - > saveHTML();
$content = ltrim($content, '<br>');
$content = strip_tags($content, '<br> <a> <iframe>');
$content = preg_replace(array('/(<br\s*\/?>\s*){1,}/'), array('<br/><br/>'), $content);
$content = str_replace(' ', ' ', $content);
$content = "<p>".implode("</p>\n\n<p>", preg_split('/\n(?:\s*\n)+/', $content))."</p>";
return $content;
帮助表示赞赏。
答案 0 :(得分:1)
ltrim怎么样?
$string = ltrim($string, '<br/>');
答案 1 :(得分:0)
您可以尝试使用正则表达式
s/!DOCTYPE html PUBLIC “-\/\/W3C\/\/DTD HTML 4.0 Transitional\/\/EN” “http:\/\/www.w3.org\/TR\/REC-html40\/loose.dtd”>((<br[^>]*/>)+)(.*)/\3/
或在PHP中:
$pattern = '/^((<br[^>]*/>)+)(.*)/i';
$replacement = '$3';
$content = preg_replace($pattern, $replacement, $content);