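I am using cURL to retrieve a page and store its HTML. That part works, and I end up with a variable containing HTML similar to this (the content inside each td differs and is always changing):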
html code above....
<tr class="myclass">
<td>Dynamic Content One</td>
<td>Dynamic Content Two</td>
<td>Dynamic Content Three</td>
</tr>
<tr class="myclass">
<td>Dynamic Content One</td>
<td>Dynamic Content Two</td>
<td>Dynamic Content Three</td>
</tr>
More of the same <tr> ......
html code below....
My goal is to parse the HTML and build an associative array called $result that stores all of the <tr> elements. The array should look like this:
$result[0]["first_content"] = "Dynamic Content One"
$result[0]["second_content"] = "Dynamic Content Two"
$result[0]["third_content"] = "Dynamic Content Three"
$result[1]["first_content"] = "Dynamic Content One"
$result[1]["second_content"] = "Dynamic Content Two"
$result[1]["third_content"] = "Dynamic Content Three"
.. more elements in array depending on how many <tr> there was
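I am finding it quite hard to parse something like this. I have used the DOMDocument and DOMXPath classes, but all I have managed is a single array containing every <td> element, and I am not sure where to put the logic that stores them in the structure above. Maybe there is a better way? Here is my code so far: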
$dom = new DOMDocument;
@$dom->loadHTML($retrievedHtml);
$xPath = new DOMXpath($dom);
$xPathQuery = "//tr[@class='myclass']";
$elements = $xPath->query($xPathQuery);
if (!is_null($elements)) {
    $results = array();
    foreach ($elements as $element) {
        $nodes = $element->childNodes;
        print $nodes->nodeValue;
        foreach ($nodes as $node) {
            // Every child node's text ends up in one flat array
            $results[] = $node->nodeValue;
        }
    }
}
Answer 0 (score: 1)
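To get the structure of your output array (minus the text keys such as "first_content"), add a new dimension to the array for each row and populate that dimension. I think this is what you were trying to achieve!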
$dom = new DOMDocument;
/* @ suppresses the warnings libxml emits for imperfect HTML */
@$dom->loadHTML($retrievedHtml);
$xPath = new DOMXpath($dom);
$xPathQuery = "//tr[@class='myclass']";
$elements = $xPath->query($xPathQuery);
/* query() returns false (not null) when the expression is invalid */
if ($elements !== false) {
    $results = array();
    foreach ($elements as $index => $element) {
        $nodes = $element->childNodes;
        foreach ($nodes as $node) {
            /* Each table row gets its own level in the array, keyed by $index.
               Only element nodes are kept, which skips whitespace text nodes. */
            if ($node->nodeType == XML_ELEMENT_NODE) {
                $results[$index][] = $node->nodeValue;
            }
        }
    }
    echo '<pre>', print_r($results, true), '</pre>';
}
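If the named keys from the question ("first_content" and so on) are also wanted, one possible sketch is to collect each row's cells and pair them with a list of key names using array_combine(). The function name parseRows, its $keys parameter, and the choice to skip rows whose cell count does not match the key count are assumptions made for illustration, not part of the code above.

```php
<?php
// Hypothetical helper: maps the <td> cells of each matching row onto named keys.
function parseRows(string $html, array $keys): array
{
    $result = array();
    if ($html === '') {
        return $result; // loadHTML() rejects an empty string
    }

    $dom = new DOMDocument();
    // Collect libxml parse warnings internally instead of silencing them with @.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xPath = new DOMXPath($dom);
    $rows = $xPath->query("//tr[@class='myclass']");
    if ($rows === false) {
        return $result;
    }

    foreach ($rows as $row) {
        $cells = array();
        // Query only the <td> children of this row, so text nodes are ignored.
        foreach ($xPath->query('td', $row) as $td) {
            $cells[] = trim($td->nodeValue);
        }
        // Assumption: rows with an unexpected number of cells are skipped.
        if (count($cells) === count($keys)) {
            $result[] = array_combine($keys, $cells);
        }
    }
    return $result;
}

// Usage with a small stand-in for $retrievedHtml.
$sampleHtml = '<table>'
    . '<tr class="myclass"><td>One</td><td>Two</td><td>Three</td></tr>'
    . '<tr class="myclass"><td>Four</td><td>Five</td><td>Six</td></tr>'
    . '</table>';
$result = parseRows($sampleHtml, array('first_content', 'second_content', 'third_content'));
print_r($result);
```

Passing the key names as a parameter keeps the function usable if the table later gains or loses a column; only the list of keys would need to change.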