PHP code to decrease HTML hX tags

Time: 2012-10-14 14:11:38

Tags: php html

  

Possible duplicate:
  How to parse and process HTML with PHP?

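I'm looking for a PHP function that demotes HTML hx (h1, h2, h3, ..., h6) tags by one level.

  • h1 becomes h2
  • h2 becomes h3, and so on
  • ...
  • h6 is replaced by ''
Do you know of such a function?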

This is how I started, by stripping the h6 tags:

$string = preg_replace('#<(?:/)?\s*h6\s*>#', ' ', $string);
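For comparison, the same regex idea can be stretched to all six levels in a single pass. This is only a rough sketch: the function name `demoteHeadingsRegex` is hypothetical, and it assumes simple, well-formed markup (no `>` inside attribute values, no heading-like text inside comments or scripts). Doing it in one pass matters, because running six separate replacements in the wrong order would demote the same heading more than once.

```php
<?php
// Hypothetical helper, not part of the original question.
// Demotes <h1>..<h5> by one level and drops <h6> tags (their content stays).
function demoteHeadingsRegex(string $html): string
{
    return preg_replace_callback(
        // 1: optional slash, 2: heading level, 3: attributes (if any)
        '#<(/?)h([1-6])\b([^>]*)>#i',
        function ($m) {
            $level = (int) $m[2];
            if ($level === 6) {
                return '';              // h6 has nowhere to go: remove the tag
            }
            return '<' . $m[1] . 'h' . ($level + 1) . $m[3] . '>';
        },
        $html
    );
}
```

Usage: `echo demoteHeadingsRegex('<h1 class="a">T</h1><h6>x</h6>');` should print `<h2 class="a">T</h2>x`.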

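1 answer:

Answer 0: (score: 2)

Here is one using DOM. It iterates over all the mappings and then either replaces the tag or moves its children up into the parent.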

<?php

// New tag mappings:
//     null => extract the children and push them into the parent container
// Keep the entries in this order (deepest level first); otherwise a heading
// that was just renamed would be matched again by the next mapping
$mapping = array(
    'h6' => null,
    'h5' => 'h6',
    'h4' => 'h5',
    'h3' => 'h4',
    'h2' => 'h3',
    'h1' => 'h2'
);

// Load document
$xml = new DOMDocument();
$xml->loadHTMLFile('http://stackoverflow.com/questions/12883009/php-code-to-decrease-html-hx-tags') or die('Failed to load');

$xPath = new DOMXPath( $xml);

foreach( $mapping as $original => $new){
    // Load nodes
    $nodes = $xPath->query( '//' . $original);

    // This is a critical error and should NEVER happen
    if( $nodes === false){
        die( 'Malformed expression: //' . $original);
    }

    echo $original . ' has nodes: ' . $nodes->length . "\n";

    // Process each node
    foreach( $nodes as $node){
        if( $new === null){
            // Move all the children in front of the node, then remove the node
            while( $node->firstChild){
                $node->parentNode->insertBefore( $node->firstChild, $node);
            }
            $node->parentNode->removeChild( $node);

        } else {
            // Create a new empty node and move all the children into it.
            // childNodes is a live list, so looping over it with foreach while
            // moving nodes would skip children; drain firstChild instead.
            // Note: attributes of the original heading are not copied.
            $newNode = $xml->createElement( $new);
            while( $node->firstChild){
                $newNode->appendChild( $node->firstChild);
            }
            $node->parentNode->replaceChild( $newNode, $node);
        }
    }
}

echo $xml->saveHTML();

You could also optimize the XPath by querying //* or //h3|//h2 and checking DOMElement::tagName, but I wanted this to be straightforward.

Edit: a solution that goes through the nodes only once and does not depend on the order of the mappings:

<?php
// The beginning (everything to the first foreach loop) remains the same
// Load nodes
$nodes = $xPath->query( '//*');

// This is a critical error and should NEVER happen
if( $nodes === false){
    die( 'Malformed expression: //*');
}

// Process each node
foreach( $nodes as $node){
    // Check correct $node class
    if( !($node instanceof DOMElement)){
        continue;
    }
    $tagName = $node->tagName;  

    // Do we have a mapping?
    if( !array_key_exists( $tagName, $mapping)){
        continue;
    }
    $new = $mapping[$tagName];
    echo 'Has element: ' . $tagName . ' => ' . $new . "\n";

    if( $new === null){
        // Move all the children in front of the node, then remove the node
        while( $node->firstChild){
            $node->parentNode->insertBefore( $node->firstChild, $node);
        }
        $node->parentNode->removeChild( $node);

    } else {
        // Create a new empty node and move all the children into it.
        // childNodes is a live list, so drain firstChild instead of using foreach.
        $newNode = $xml->createElement( $new);
        while( $node->firstChild){
            $newNode->appendChild( $node->firstChild);
        }
        $node->parentNode->replaceChild( $newNode, $node);
    }
}

echo $xml->saveHTML();

The last optimization I can think of is to use:

// Separator first: the reversed argument order is no longer accepted in PHP 8
$xPathQuery = '//' . implode( '|//', array_keys($mapping));
$nodes = $xPath->query( $xPathQuery);
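
Since the question asks for a function that works on a string, the pieces above could be wrapped roughly as follows. This is only a sketch under a few assumptions: the name `demoteHeadings`, the wrapper id `demote-root`, and the copying of attributes onto the new heading are not in the original answer, and character-encoding handling is left out (input is assumed to be ASCII or entity-encoded).

```php
<?php
// Hypothetical wrapper around the single-pass approach shown above.
function demoteHeadings(string $html): string
{
    $mapping = array(
        'h6' => null, 'h5' => 'h6', 'h4' => 'h5',
        'h3' => 'h4', 'h2' => 'h3', 'h1' => 'h2',
    );

    // Wrap the fragment in a known container so it can be extracted again.
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true);   // silence parser warnings
    $doc->loadHTML('<div id="demote-root">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xPath = new DOMXPath($doc);
    $query = '//' . implode('|//', array_keys($mapping));

    // The XPath result is a snapshot, so nodes can be replaced while looping.
    foreach ($xPath->query($query) as $node) {
        $new    = $mapping[$node->tagName];
        $parent = $node->parentNode;

        if ($new === null) {
            // Unwrap: move the children in front of the node, then drop it.
            while ($node->firstChild) {
                $parent->insertBefore($node->firstChild, $node);
            }
            $parent->removeChild($node);
        } else {
            $newNode = $doc->createElement($new);
            // Assumption: attributes such as id/class should be preserved.
            foreach ($node->attributes as $attr) {
                $newNode->setAttribute($attr->nodeName, $attr->nodeValue);
            }
            while ($node->firstChild) {
                $newNode->appendChild($node->firstChild);
            }
            $parent->replaceChild($newNode, $node);
        }
    }

    // Serialize only the children of the wrapper, not the whole document.
    $root = $xPath->query('//div[@id="demote-root"]')->item(0);
    $out  = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

Usage: `echo demoteHeadings('<h1 class="a">T</h1><h5>y</h5><h6>x</h6>');`. The serializer may insert line breaks between block elements, so the output should be compared loosely rather than byte for byte.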