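I wrote the following code to display an external website on an otherwise blank page. To do that I had to delete several nodes, and each node needs its own block of code. If the page is large, this becomes almost unworkable.
My questions:
Is there a way to list everything we want to remove (footer, header, headerContent, etc.) in one place?
Is there a smarter way to clean up the page than deleting elements one by one, so that only the content I want (TABELA1) is shown?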
<body>
<?php
# Create a DOM parser object
$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTMLFile('http://www.sptrans.com.br/sac/solicitacoes.aspx');
$data = $dom->getElementById('TABELA1');
$xpath = new DOMXPath($dom);
foreach ($xpath->query('//div[contains(attribute::id, "novidadeDestaque")]') as $e) {
    // Delete this node
    $e->parentNode->removeChild($e);
}
$xpath = new DOMXPath($dom);
foreach ($xpath->query('//div[contains(attribute::id, "headerLvl1")]') as $e) {
    // Delete this node
    $e->parentNode->removeChild($e);
}
$xpath = new DOMXPath($dom);
foreach ($xpath->query('//div[contains(attribute::id, "headerContent")]') as $e) {
    // Delete this node
    $e->parentNode->removeChild($e);
}
$xpath = new DOMXPath($dom);
foreach ($xpath->query('//div[contains(attribute::id, "novo_menu")]') as $e) {
    // Delete this node
    $e->parentNode->removeChild($e);
}
$xpath = new DOMXPath($dom);
foreach ($xpath->query('//div[contains(attribute::id, "footer")]') as $e) {
    // Delete this node
    $e->parentNode->removeChild($e);
}
$xpath = new DOMXPath($dom);
foreach ($xpath->query('//div[contains(attribute::id, "header")]') as $e) {
    // Delete this node
    $e->parentNode->removeChild($e);
}
$xpath = new DOMXPath($dom);
foreach ($xpath->query('//div[contains(attribute::id, "pageNovidades")]') as $e) {
    // Delete this node
    $e->parentNode->removeChild($e);
}
echo $dom->saveHTML();
?>
</body>
Answer 0 (score: 1)
To write a short routine that removes the unwanted elements, you can put their ids in an array:
$xpath = new DOMXPath($dom);
$idToDelete = [ 'novidadeDestaque', 'headerLvl1', ... ];
foreach ($idToDelete as $id) {
    foreach ($xpath->query('//div[contains(attribute::id, "'.$id.'")]') as $e) {
        $e->parentNode->removeChild($e);
    }
}
Note that you do not need to create a new DOMXPath object for every search: creating it once per DOMDocument object is enough.
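The array approach can be tried without touching the live site. The following is a minimal sketch, assuming a small inline HTML string in place of the downloaded page; the helper name `removeDivsByIdFragment` and the sample markup are illustrative and not part of the original code. It also assumes the id fragments contain no double quotes, since they are concatenated into the XPath expression.

```php
<?php
// Hypothetical sample markup standing in for the downloaded page.
$html = '<html><body>'
      . '<div id="headerLvl1">header</div>'
      . '<div id="novo_menu">menu</div>'
      . '<table id="TABELA1"><tr><td>data</td></tr></table>'
      . '<div id="footer">footer</div>'
      . '</body></html>';

/**
 * Remove every <div> whose id contains one of the given fragments.
 * Returns the number of removeChild() calls made.
 * Assumes the fragments contain no double quotes.
 */
function removeDivsByIdFragment(DOMDocument $dom, array $fragments): int
{
    $xpath = new DOMXPath($dom); // created once, reused for every query
    $removed = 0;
    foreach ($fragments as $fragment) {
        $query = '//div[contains(@id, "' . $fragment . '")]';
        foreach ($xpath->query($query) as $node) {
            $node->parentNode->removeChild($node);
            $removed++;
        }
    }
    return $removed;
}

libxml_use_internal_errors(true); // silence warnings about imperfect markup
$dom = new DOMDocument();
$dom->loadHTML($html);

$removedCount = removeDivsByIdFragment($dom, ['header', 'novo_menu', 'footer']);
$result = $dom->saveHTML();
echo $result;
```

Because `contains()` does substring matching, the single fragment `header` also covers `headerLvl1` and `headerContent`, so the list of ids can be shorter than the list of elements.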
To output only the content you want, you can use this syntax:
$table = $dom->getElementById( 'MyTable' );
echo $dom->saveHTML( $table );
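As a self-contained illustration of that idea, here is a minimal sketch that serializes a single element and guards against a missing id. The inline markup is an assumption made for the example; only `TABELA1` comes from the question.

```php
<?php
// Hypothetical sample markup; only the id TABELA1 comes from the question.
$html = '<html><body><div id="header">nav</div>'
      . '<table id="TABELA1"><tr><td>data</td></tr></table></body></html>';

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);

$table = $dom->getElementById('TABELA1');

// getElementById() returns null when no element carries that id,
// so fall back to an empty string instead of dumping the whole page.
$fragment = ($table !== null) ? $dom->saveHTML($table) : '';
echo $fragment;
```

Passing a node to `saveHTML()` serializes just that subtree, so nothing has to be deleted from the rest of the document.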
To get a complete HTML document that contains only the desired table, you can create a new DOMDocument and add your table to it with importNode:
$src = new DOMDocument();
$dst = new DOMDocument();
// $html is assumed to hold the markup of the source page
$src->loadHTML( $html );
$dst->loadHTML( '<html><head><title>Untitled</title></head><body></body></html>' );
$table = $src->getElementById( 'MyTable' );
// Pass true for a deep copy; otherwise the table is imported without its rows
$imported = $dst->importNode( $table, true );
$dst->getElementsByTagName( 'body' )->item(0)->appendChild( $imported );
echo $dst->saveHTML();
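The second argument of `importNode` decides whether child nodes are copied along with the element. The sketch below, which assumes an inline sample document rather than the real page, imports the same table once shallowly and once deeply to make the difference visible.

```php
<?php
// Hypothetical source markup with some unwanted content around the table.
$html = '<html><body><p>noise</p>'
      . '<table id="MyTable"><tr><td>cell</td></tr></table></body></html>';

libxml_use_internal_errors(true);
$src = new DOMDocument();
$src->loadHTML($html);
$table = $src->getElementById('MyTable');

$dst = new DOMDocument();
$dst->loadHTML('<html><head><title>Untitled</title></head><body></body></html>');

$shallow = $dst->importNode($table, false); // the element without its child nodes
$deep    = $dst->importNode($table, true);  // the element with rows and cells

$shallowChildCount = $shallow->childNodes->length;

// Only the deep copy is attached to the new document.
$body = $dst->getElementsByTagName('body')->item(0);
$body->appendChild($deep);

$page = $dst->saveHTML();
echo $page;
```

The shallow copy has no child nodes, which is why a deep import is needed when the table contents should appear in the new page.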