I have a question about the PHP DOM object http://php.net/manual/en/class.domdocument.php
Is it possible to display only the content of the second tr tag in the third table tag?
/*** a new dom object ***/
$dom = new domDocument;
/*** load the html into the object ***/
@$dom->loadHTML($html);
/*** discard white space ***/
$dom->preserveWhiteSpace = false;
/*** the table by its tag name ***/
$tables = $dom->getElementsByTagName('table');
/*** get all rows from the table ***/
$rows = $tables->item(0)->getElementsByTagName('tr');
/*** loop over the table rows ***/
foreach ($rows as $row)
{
/*** get each column by tag name ***/
$cols = $row->getElementsByTagName('td');
/*** echo the values ***/
echo $cols->item(0)->nodeValue.'<br />';
echo $cols->item(1)->nodeValue.'<br />';
echo $cols->item(2)->nodeValue.'<br />';
echo $cols->item(3)->nodeValue.'<br />';
echo $cols->item(4)->nodeValue.'<br />';
echo $cols->item(5)->nodeValue.'<br />';
echo '<hr />';
}
EDIT:
I get this error: Fatal error: Cannot use object of type DOMNodeList as array in
<?php
/*** a new dom object ***/
$dom = new domDocument;
/*** load the html into the object ***/
@$dom->loadHTML('content.html');
/*** discard white space ***/
$dom->preserveWhiteSpace = false;
$xpath = new DOMXPath($dom);
$selected = $xpath->query('//table/tr/td[first()+1]');
echo $selected[0]->nodeValue;
?>
EDIT2:
<?php
$output = file_get_contents('test.php');
/*** a new dom object ***/
$dom = new domDocument;
/*** load the html into the object ***/
@$dom->loadHTML($output);
/*** discard white space ***/
$dom->preserveWhiteSpace = false;
/*** the table by its tag name ***/
$tables = $dom->getElementsByTagName('table');//get all the tables
if($tables->length > 2) { //check there are more than 2
$thirdTable = $tables->item(2);
$cols = $thirdTable->getElementsByTagName('td');
/*** echo the values ***/
echo $cols->item(0)->nodeValue.'<br />';
echo $cols->item(1)->nodeValue.'<br />';
echo $cols->item(2)->nodeValue.'<br />';
echo $cols->item(3)->nodeValue.'<br />';
echo $cols->item(4)->nodeValue.'<br />';
echo $cols->item(5)->nodeValue.'<br />';
echo '<hr />';
}
?>
EDIT3 - This code displays only the content of the third table tag. But it should display only the content of the second tr tag in the third table.
$html = file_get_contents('content.html');
/*** a new dom object ***/
$dom = new domDocument;
/*** load the html into the object ***/
@$dom->loadHTML($html);
/*** discard white space ***/
$dom->preserveWhiteSpace = false;
/*** the table by its tag name ***/
$tables = $dom->getElementsByTagName('table');
/*** get all rows from the table ***/
$rows = $tables->item(2)->getElementsByTagName('tr')->item(1);
/*** loop over the table rows ***/
foreach ($rows as $row)
{
/*** get each column by tag name ***/
$cols = $row->getElementsByTagName('td');
/*** echo the values ***/
echo $cols->item(0)->nodeValue.'<br />';
echo $cols->item(1)->nodeValue.'<br />';
echo $cols->item(2)->nodeValue.'<br />';
echo $cols->item(3)->nodeValue.'<br />';
echo $cols->item(4)->nodeValue.'<br />';
echo $cols->item(5)->nodeValue.'<br />';
echo '<hr />';
}
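Answer 0 (score: 2)
I don't quite understand your question. $cols->item(2) already gives you one specific DOMElement.
Note that item() is zero-based, so item(1) is the second element and item(2) is the third.
If you only want the second (or third...) cell, you can use XPath: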
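$xpath = new DOMXPath($dom);
// XPath has no first() function; td[2] and td[3] select cells by 1-based position
$selected = $xpath->query('//table/tr/td[2] | //table/tr/td[3]');
// item(0) works on every PHP version; $selected[0] needs array access on DOMNodeList
echo $selected->item(0)->nodeValue;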
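To target only the second row of the third table, the XPath idea above can be narrowed. The following is a minimal sketch, not a definitive solution: the sample markup is hypothetical and stands in for content.html, and it assumes the third table contains no nested tables (a nested table's second row would match as well).

```php
<?php
// Hypothetical sample markup standing in for content.html.
$html = <<<HTML
<table><tr><td>t1</td></tr></table>
<table><tr><td>t2</td></tr></table>
<table>
  <tr><td>r1c1</td><td>r1c2</td></tr>
  <tr><td>r2c1</td><td>r2c2</td></tr>
</table>
HTML;

$dom = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of using @
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
// (//table)[3] is the third table in document order; tr[2] is its second row.
$cells = $xpath->query('(//table)[3]//tr[2]/td');

$values = [];
foreach ($cells as $cell) {
    $values[] = trim($cell->nodeValue);
}
echo implode('<br />', $values), '<hr />'; // prints: r2c1<br />r2c2<hr />
```

The parentheses matter: (//table)[3] picks the third table in the whole document, whereas //table[3] would pick tables that are the third table child of their own parent.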
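If you don't want to use DOMXPath, you can use getElementsByTagName: first get all the tables, check that there are more than two, take the third one, then get its tr elements and keep the second and third in an array.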
$tables = $dom->getElementsByTagName('table'); // get all the tables
if ($tables->length > 2) { // check there are more than 2
    $thirdTable = $tables->item(2);
    // get the tr then td
}
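The getElementsByTagName steps described above could be completed roughly as follows. This is only a sketch: getRowCells is a hypothetical helper name, the sample markup is made up, and both indexes are zero-based (so 2 and 1 mean "third table, second row").

```php
<?php
/**
 * Hypothetical helper: returns the trimmed text of every td in one row of
 * one table, or an empty array if that table or row does not exist.
 */
function getRowCells(string $html, int $tableIndex, int $rowIndex): array
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true); // collect parser warnings instead of using @
    $dom->loadHTML($html);
    libxml_clear_errors();

    // item() returns null when the index is out of range.
    $table = $dom->getElementsByTagName('table')->item($tableIndex);
    if ($table === null) {
        return [];
    }
    $row = $table->getElementsByTagName('tr')->item($rowIndex);
    if ($row === null) {
        return [];
    }

    $cells = [];
    foreach ($row->getElementsByTagName('td') as $td) {
        $cells[] = trim($td->nodeValue);
    }
    return $cells;
}

// Made-up markup: three tables, the third one has two rows.
$sample = '<table><tr><td>a</td></tr></table>'
        . '<table><tr><td>b</td></tr></table>'
        . '<table><tr><td>c1</td><td>c2</td></tr><tr><td>d1</td><td>d2</td></tr></table>';

$cells = getRowCells($sample, 2, 1);
echo implode('<br />', $cells), '<hr />';
```

With a real file the call would be getRowCells(file_get_contents('content.html'), 2, 1). Keep in mind that getElementsByTagName returns all descendants, so rows of tables nested inside the third table would be counted too.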
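Answer 1 (score: 1)
In EDIT3 you are running foreach over the result of ->item(1). That is a single DOMElement (the second tr), not a list of rows, so there is nothing to loop over. Iterate over the DOMNodeList itself instead, for example with a for loop: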
$tables = $dom->getElementsByTagName('table');
if ($tables->length < 3) {
    // Ahh crap! There is no third table!
    exit('No third table found');
}
$thirdTable = $tables->item(2);
$rows = $thirdTable->getElementsByTagName('tr');
for ($i = 0; $i < $rows->length; $i++) {
    $row = $rows->item($i);
    $cols = $row->getElementsByTagName('td');
    // item() lives on the DOMNodeList ($cols), not on the row element
    $secondTd = $cols->item(1);
    $thirdTd = $cols->item(2);
}
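For reference, here is a small check of what the two expressions involved actually return. It is a sketch on made-up markup: getElementsByTagName() gives a DOMNodeList, which can be looped over, while ->item(n) gives one single element.

```php
<?php
$dom = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of using @
$dom->loadHTML('<table><tr><td>a</td></tr><tr><td>b</td></tr></table>');
libxml_clear_errors();

$rows = $dom->getElementsByTagName('tr');

// The list of rows is a DOMNodeList and can be traversed with foreach.
$isList        = $rows instanceof DOMNodeList;
$isTraversable = $rows instanceof Traversable;

// item(1) is one row (the second tr), not a list of rows.
$secondRow = $rows->item(1);
$isElement = $secondRow instanceof DOMElement;

$texts = [];
foreach ($rows as $row) {
    $texts[] = $row->nodeValue;
}

var_dump($isList, $isTraversable, $isElement);
echo implode(', ', $texts), PHP_EOL;
```

So either loop style works on the list of rows; what matters is that the loop runs over the DOMNodeList and that item() is called on a list rather than on an element.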