Traversing an HTML DOM in PHP with a recursive function
This is the HTML DOM that I am trying to convert to a PHP array:
<head>
<title> My New Web Page </title>
</head>
<body>
<table>
<tr><td><h1> Welcome to My Web Page! </h1></td></tr>
<tr><td><div>Menu item 1<div>Menu item 2</div></div></td></tr>
</table>
</body>
$nodes_array[$recurse_count][$body_elem->tag] = $value;
This line sets a value in the array on every call of the function, and I get the following result:
Array
(
[1] => Array
(
[body] => table
)
[2] => Array
(
[table] => tr
)
[3] => Array
(
[tr] => td
)
[4] => Array
(
[td] => div
)
[5] => Array
(
[div] => div
)
)
But I want to get this:
Array
(
[1] => Array
(
[body] => Array
(
[table] => Array
(
[tr] =>
[0]=>Array
(
[td] => div
)
[1]=>Array
(
[td] => Array
(
[div] => div
)
)
)
)
)
I tried to use variable references, but I do not know enough about them.
The function code:
function recurve_extract($body_elem, $tag_str_name, $recurse_count)
{
    global $nodes_array;
    global $recurve_level;
    if (sizeof($body_elem->children()) > 0)
    {
        foreach ($body_elem->children() as $each_elem)
        {
            echo "<hr/>";
            echo $tag_str_name = $tag_str_name . '[' . $each_elem->tag . ']';
            $keys = explode('][', trim($tag_str_name, '[]'));
            print_r($keys);
            echo $body_elem->tag," == ".$each_elem->tag;
            //$value = array($each_elem->tag=>"");
            $value = $each_elem->tag;
            // setValue($nodes_array,$keys,$value);
            $nodes_array[$recurse_count][$body_elem->tag] = $value;
            if ($recurse_count < 10)
            {
                recurve_extract($each_elem, $tag_str_name, $recurse_count + 1);
            }
        }
    }
}
recurve_extract($body_elem, '[body]',1);
print_r($nodes_array);
echo "</pre>";
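Answer 0 (score: 0)
In the recursive function, keep track of the identifiers of the parent elements. Then, when you store the actual value of an element, put it at my_array[parent_level1][parent_level2][parent_level_x].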
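The idea above can be sketched as a small helper that walks a list of parent keys by reference and assigns the value at the deepest level. This is only an illustration: setValueAtPath() is a hypothetical name (the question's commented-out setValue() call suggests something similar was intended), and the key list is assumed to be built by the caller while recursing.

```php
<?php
// Hypothetical helper (not from the question): writes $value at the nested
// position described by $keys, creating intermediate arrays as needed.
function setValueAtPath(array &$array, array $keys, $value): void
{
    $cursor = &$array;                 // reference to the current nesting level
    foreach ($keys as $key) {
        if (!isset($cursor[$key]) || !is_array($cursor[$key])) {
            $cursor[$key] = [];        // create the level (replaces a scalar, if any)
        }
        $cursor = &$cursor[$key];      // descend one level, still by reference
    }
    $cursor = $value;                  // assign at the deepest level
}

$nodes = [];
setValueAtPath($nodes, ['body', 'table', 'tr', 'td'], 'div');
print_r($nodes);
```

Because each call overwrites whatever is stored at the same path, repeated sibling tags such as the two tr elements would still need an index in the key list to avoid replacing each other.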
Answer 1 (score: 0)
I have managed to adapt some code so that it produces almost what you want, although it also adds the text content of each leaf node:
$source = <<< XML
<html>
<head>
<title> My New Web Page </title>
</head>
<body>
<table>
<tr><td><h1> Welcome to My Web Page! </h1></td></tr>
<tr><td><div>Menu item 1<div>Menu item 2</div></div></td></tr>
</table>
</body>
</html>
XML;
function extractXML( $base, SimpleXMLElement $node)
{
    $nodeName = $node->getName();
    $childNodes = $node->children();
    if ( count($childNodes) == 0 ) {
        // Leaf node: store its text content under the tag name
        $base[ $nodeName ] = (string)$node;
    }
    else {
        // Collect the converted form of every child node
        $new = [];
        foreach ( $childNodes as $newNode ) {
            $new[] = extractXML($base, $newNode);
        }
        // A single child is stored directly, several children as a list
        $base[$nodeName] = count($new)>1?$new:$new[0];
    }
    return $base;
}
$body_elem = simplexml_load_string($source);
$nodes_array = extractXML([], $body_elem->body);
print_r($nodes_array);
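Recursive functions can work well, but you need to be very careful about what you pass in and what you pass back. Using global adds more confusion, so try to keep the function self-contained.
This routine is passed what has been built so far ($base) and the node to process ($node). It loops over the node's children and, wherever there are child nodes, calls the same routine for each of them. Note that to start it off, I pass in the body tag to tell it where to begin extracting.
The output is: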
Array
(
    [body] => Array
        (
            [table] => Array
                (
                    [0] => Array
                        (
                            [tr] => Array
                                (
                                    [td] => Array
                                        (
                                            [h1] => Welcome to My Web Page!
                                        )
                                )
                        )
                    [1] => Array
                        (
                            [tr] => Array
                                (
                                    [td] => Array
                                        (
                                            [div] => Array
                                                (
                                                    [div] => Menu item 2
                                                )
                                        )
                                )
                        )
                )
        )
)
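As a variation on extractXML() above, the following sketch groups repeated sibling tags under their tag name, which is closer to the [tr] => [0], [1] shape asked for in the question. It is a minimal illustration, not a drop-in replacement: nodeToArray() is a hypothetical name, the input is assumed to be well-formed XML that SimpleXML can parse, and the shortened markup is only there to keep the example self-contained.

```php
<?php
// Hypothetical variation: returns the children of $node keyed by tag name.
// Tags that repeat among siblings become a numerically indexed list.
function nodeToArray(SimpleXMLElement $node)
{
    if (count($node->children()) === 0) {
        return trim((string) $node);       // leaf: keep only its text
    }
    $grouped = [];
    foreach ($node->children() as $child) {
        $grouped[$child->getName()][] = nodeToArray($child);
    }
    // Unwrap tags that occur once so they are not wrapped in a one-item list.
    foreach ($grouped as $name => $items) {
        if (count($items) === 1) {
            $grouped[$name] = $items[0];
        }
    }
    return $grouped;
}

$xml = simplexml_load_string(
    '<body><table>'
    . '<tr><td><h1>Welcome</h1></td></tr>'
    . '<tr><td><div>Menu item 1<div>Menu item 2</div></div></td></tr>'
    . '</table></body>'
);
$tree = [$xml->getName() => nodeToArray($xml)];
print_r($tree);
```

No global state is involved: each call returns only the structure of its own subtree, and the caller decides where to put it. As in the original routine, text that sits next to child elements (here "Menu item 1") is dropped.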