Nested UL LI to PHP array - incorrect output in the array

Time: 2012-01-27 11:32:51

Tags: php arrays recursion tree html-lists

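This is a follow-up to my question from yesterday, Recursive UL LI to PHP multi-dimensional array. I have almost managed to convert the HTML block into an array, but there are a couple of problems I can't resolve. When processing the HTML block below, the output array does not quite follow what was input (I can't see where I'm going wrong and need a fresh pair of eyes!).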

I have included the following:

  • HTML block
  • PHP function and processing
  • Output

HTML Block

It basically takes the following form:

-A
  -B
    -C
----
-D
  -E
    -F
----
-G
  -H
    -I

As follows:

<li>
    <ul>
        <li>A</li>
        <li>
            <ul>
                <li>B</li>
                <li>
                    <ul>
                        <li>C</li>
                    </ul>
                </li>
            </ul>
        </li>
    </ul>
</li>
<li>
    <ul>
        <li>D</li>
        <li>
            <ul>
                <li>E</li>
                <li>
                    <ul>
                        <li>F</li>
                    </ul>
                </li>
            </ul>
        </li>
    </ul>
</li>
<li>
    <ul>
        <li>G</li>
        <li>
            <ul>
                <li>H</li>
                <li>
                    <ul>
                        <li>I</li>
                    </ul>
                </li>
            </ul>
        </li>
    </ul>
</li>

PHP Function and Processing

function process_ul($output_data, $data, $key, $level_data, $level_key){

    if(substr($data[$key], 0, 3) == '<ul'){
        // going down a level in the tree
        $level_key++;

        // check to see if the level key exists within the level data, else create it and set to zero
        if(!is_numeric($level_data[$level_key])){
            $level_data[$level_key] = 0;
        }

        // increment the key to look at the next line
        $key++;

        if(substr($data[$key], 0, 4) !== '</ul'){
            while(substr($data[$key], 0, 4) !== '</ul'){
                // whilst we don't have an end of list, do some recursion and keep processing the array

                $returnables = process_ul($output_data, $data, $key, $level_data, $level_key);
                $output_data = $returnables['output'];
                $data = $returnables['data'];
                $key = $returnables['key'];
                $level_data = $returnables['level_data'];
                $level_key = $returnables['level_key'];
            }
        }
    }

    if(substr($data[$key], 0, 4) !== '</ul' && $data[$key] !== "<li>" && $data[$key] !== "</li>"){
        // we don't want to be saving lines with no data or the ends of a list

        // get the array key value so we know where to save it in our array (basically so we can't overwrite anything that may already exist
        $this_key = &$output_data;
        for($build_key=0;$build_key<($level_key+1); $build_key++){
            $this_key =& $this_key[$level_data[$build_key]];
        }

        if(is_array($this_key)){
            // look at the next key, find the next open one
            $this_key[(array_pop(array_keys($this_key))+1)] = $data[$key];
        } else {
            // a new entry, so nothing to worry about
            $this_key = $data[$key];
        }
        $level_data[$level_key]++;
    } else if(substr($data[$key], 0, 4) == '</ul'){
        // going up a level in the tree
        $level_key--;
    }

    // increment the key to look at the next line when we loop in a moment
    $key++;

    // prepare the data to be returned
    $return_me = array();
    $return_me['output'] = $output_data;
    $return_me['data'] = $data;
    $return_me['key'] = $key;
    $return_me['level_data'] = $level_data;
    $return_me['level_key'] = $level_key;

    // return the data
    return $return_me;
}


// explode the data coming in by looking at the new lines
$input_array = explode("\n", $html_ul_tree_in); 

// get rid of any empty lines - we don't like those
foreach($input_array as $key => $value){
    if(trim($value) !== ""){
        $input_data[] = trim($value);
    }
}

// set the array and the starting level
$levels = array();
$levels[0] = 0;
$this_level = 0;

// loop around the data and process it
for($i=0; $i<count($input_data); $i){
    $returnables = process_ul($output_data, $input_data, $i, $levels, $this_level);
    $output_data = $returnables['output'];
    $input_data = $returnables['data'];
    $i = $returnables['key'];
    $levels = $returnables['level_data'];
    $this_level = $returnables['level_key'];
}

// let's see how we did
print_r($output_data);

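Output

Note that D is in the wrong position: it should be at [0][2], not [0][1][2], and every entry after D is off by one position (as I'm sure you can see).

It basically takes the following form: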

-A
  -B
    -C
  -D
----
  -E
    -F
  -G
----
  -H
    -I

As follows:

Array
(
    [0] => Array
        (
            [0] => <li>A</li>
            [1] => Array
                (
                    [0] => <li>B</li>
                    [1] => Array
                        (
                            [0] => <li>C</li>
                        )

                    [2] => <li>D</li>
                )

            [2] => Array
                (
                    [1] => <li>E</li>

                    [2] => Array
                        (
                            [1] => <li>F</li>
                        )

                    [3] => <li>G</li>
                )

            [3] => Array
                (
                    [2] => <li>H</li>
                    [3] => Array
                        (
                            [2] => <li>I</li>
                        )

                )

        )

)

Thanks for your time. Any help getting the array to output correctly is much appreciated!

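2 Answers:

Answer 0 (score: 3)

IF your list is always well-formed, you can use this to do what you want. It uses SimpleXML, so it does not tolerate errors or malformed markup in the input. If you want to be tolerant, you will need to use DOM: the code will be somewhat more complex, but not ridiculously so.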

function ul_to_array ($ul) {
  if (is_string($ul)) {
    if (!$ul = simplexml_load_string("<ul>$ul</ul>")) {
      trigger_error("Syntax error in UL/LI structure");
      return FALSE;
    }
    return ul_to_array($ul);
  } else if (is_object($ul)) {
    $output = array();
    foreach ($ul->li as $li) {
      $output[] = (isset($li->ul)) ? ul_to_array($li->ul) : (string) $li;
    }
    return $output;
  } else return FALSE;
}

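It takes the data in the exact form provided in the question, with no outer enclosing <ul> tags. If you want to pass the outer <ul> tags as part of the input string, simply change

if (!$ul = simplexml_load_string("<ul>$ul</ul>")) {

to

if (!$ul = simplexml_load_string($ul)) {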
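
For input that may not be well-formed, a DOM-based variant along the lines hinted at above could look roughly like this. It is a minimal sketch and not part of the original answer: the names dom_ul_to_array() and html_list_to_array() are made up for illustration, and it assumes the same input shape as the question (a run of <li> elements with no outer <ul>).

```php
<?php
// Hypothetical helpers (illustration only): build the same nested array
// shape as ul_to_array(), but parse with DOMDocument instead of SimpleXML.

function dom_ul_to_array(DOMNode $ul): array
{
    $output = array();
    foreach ($ul->childNodes as $li) {
        // Skip whitespace text nodes and anything that is not an <li>
        if ($li->nodeType !== XML_ELEMENT_NODE || $li->nodeName !== 'li') {
            continue;
        }
        // Look for a <ul> that is a direct child of this <li>
        $nested = null;
        foreach ($li->childNodes as $child) {
            if ($child->nodeType === XML_ELEMENT_NODE && $child->nodeName === 'ul') {
                $nested = $child;
                break;
            }
        }
        // Recurse into a nested list, otherwise keep the item's text
        $output[] = ($nested !== null)
            ? dom_ul_to_array($nested)
            : trim($li->textContent);
    }
    return $output;
}

function html_list_to_array(string $html)
{
    // Collect parser warnings internally instead of emitting them
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    // Wrap in <ul>, matching the question's input (assumption: no outer <ul>)
    $loaded = $dom->loadHTML('<ul>' . $html . '</ul>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if (!$loaded) {
        return false;
    }
    $ul = $dom->getElementsByTagName('ul')->item(0);
    return ($ul === null) ? false : dom_ul_to_array($ul);
}

$html = '<li><ul><li>A</li><li><ul><li>B</li></ul></li></ul></li>'
      . '<li><ul><li>D</li></ul></li>';
$result = html_list_to_array($html);
print_r($result);
```

Unlike the SimpleXML version, this walks childNodes explicitly, so only a <ul> that is a direct child of an <li> is treated as a nested list.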


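Answer 1 (score: 1)

Here is a working example that parses the HTML with DOMDocument and converts it to an array using the domNodeToArray() function provided here: http://www.ermshaus.org/2010/12/php-transform-domnode-to-array

The HTML does not need to be well-formed.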

// $inputHTML is your HTML-list as a string

// this is necessary to prevent DOMDocument errors on HTML5-elements
libxml_use_internal_errors(true);

$dom = new DOMDocument();

// UTF-8 hack, to correctly handle UTF-8 through DOMDocument
$dom->loadHTML('<?xml encoding="UTF-8">' . $inputHTML);

// get the first list-element in the HTML-document
$listAsDom = $dom->getElementsByTagName('ul')->item(0);

// print it out as array
var_dump(domNodeToArray($listAsDom));


/**
 * Transforms the contents of a DOMNode to an associative array
 * @author Marc Ermshaus
 * http://www.ermshaus.org/2010/12/php-transform-domnode-to-array
 * 
 * @param DOMNode $node DOMDocument node
 * @return mixed Associative array or string with node content
 */
function domNodeToArray(DOMNode $node) {
    $ret = '';

    if ($node->hasChildNodes()) {
        if ($node->firstChild === $node->lastChild
            && $node->firstChild->nodeType === XML_TEXT_NODE
        ) {
            // Node contains nothing but a text node, return its value
            $ret = trim($node->nodeValue);
        } else {
            // Otherwise, do recursion
            $ret = array();
            foreach ($node->childNodes as $child) {
                if ($child->nodeType !== XML_TEXT_NODE) {
                    // If there's more than one node with this node name on the
                    // current level, create an array
                    if (isset($ret[$child->nodeName])) {
                        if (!is_array($ret[$child->nodeName])
                            || !isset($ret[$child->nodeName][0])
                        ) {
                            $tmp = $ret[$child->nodeName];
                            $ret[$child->nodeName] = array();
                            $ret[$child->nodeName][] = $tmp;
                        }

                        $ret[$child->nodeName][] = domNodeToArray($child);
                    } else {
                        $ret[$child->nodeName] = domNodeToArray($child);
                    }
                }
            }
        }
    }

    return $ret;
}
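
The libxml_use_internal_errors(true) call above silences parser warnings, but they are still recorded and can be inspected. The following is a small sketch of that, not part of the original answer; the sample markup is an assumption chosen for illustration, and whether a given libxml2 version reports anything for it may vary.

```php
<?php
// Sketch: look at the warnings that libxml_use_internal_errors(true) buffers.
// The sample markup and variable names are illustrative assumptions.

libxml_use_internal_errors(true);

$dom = new DOMDocument();
// <section> is an HTML5 element that the libxml2 HTML parser may not recognise
$dom->loadHTML('<ul><li>A</li><li><section>B</section></li></ul>');

// Each entry is a LibXMLError object with level, code, line and message fields
$errors = libxml_get_errors();
foreach ($errors as $error) {
    printf("line %d: %s\n", $error->line, trim($error->message));
}

// Clear the buffer so later parses do not accumulate old errors
libxml_clear_errors();

// The document is still usable even if warnings were recorded
$items = $dom->getElementsByTagName('li');
echo $items->length, " list items parsed\n";
```

Clearing the buffer after each parse keeps memory use flat when many documents are processed in one script.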