我正在尝试在页面中获取表格的HTML标记:
$new_dom = new DOMDocument();
$table = '';
$nodesTable = $this->dom->getElementsbyTagName("table");
foreach($nodesTable as $nodeTable){
$color = $nodeTable->getAttribute('bordercolordark');
if ($color == '#73BAFF') {
$table = $nodeTable;
}
}
$new_dom->appendChild($table);
echo $new_dom->saveHTML();
这是somepage.html:
<html>
<table>
<tr> <td> 10 </td> </tr>
<tr> <td> 10 </td> </tr>
<tr> <td> 10 </td> </tr>
<tr> <td> 10 </td> </tr>
</table>
<table border="1" cellpadding="0" width="500" bordercolorlight="#ACD6FF" bordercolordark="#73BAFF" align="center">
<tr>
<td rowspan="2" colspan="2" bgcolor="#73BAFF"> </td>
<td colspan="3" align="center" bgcolor="#ACD6FF"> Element 1 </td>
<td colspan="3" align="center" bgcolor="#ACD6FF"> Element 2 </td>
</tr>
<tr>
<td width="50" align="center" bgcolor="#ACD6FF"> 50 </td>
<td width="50" align="center" bgcolor="#ACD6FF"> 50 </td>
<td width="50" align="center" bgcolor="#ACD6FF"> 50 </td>
<td width="50" align="center" bgcolor="#ACD6FF"> 50 </td>
<td width="50" align="center" bgcolor="#ACD6FF"> 50 </td>
<td width="50" align="center" bgcolor="#ACD6FF"> 50 </td>
</tr>
<tr>
<td bgcolor="#ACD6FF" width="155" align="center"> Row 1</td>
<td bgcolor="#ACD6FF" width="45" align="center"> 30 </td>
<td align="center"> 50 </td>
<td align="center"> 50 </td>
<td align="center"> 50 </td>
<td align="center"> 50 </td>
<td align="center"> 50 </td>
<td align="center"> 50 </td>
</tr>
<tr>
<td bgcolor="#ACD6FF" width="155" align="center"> Row 2</td>
<td bgcolor="#ACD6FF" width="45" align="center"> 30 </td>
<td align="center"> 60 </td>
<td align="center"> 60 </td>
<td align="center"> 60 </td>
<td align="center"> 60 </td>
<td align="center"> 60 </td>
<td align="center"> 60 </td>
</tr>
<tr>
<td bgcolor="#ACD6FF" width="155" align="center"> Row 3</td>
<td bgcolor="#ACD6FF" width="45" align="center"> 30 </td>
<td align="center"> 70 </td>
<td align="center"> 70 </td>
<td align="center"> 70 </td>
<td align="center"> 70 </td>
<td align="center"> 70 </td>
<td align="center"> 70 </td>
</tr>
</table>
<table>
<tr> <td> 10 </td> </tr>
<tr> <td> 10 </td> </tr>
<tr> <td> 10 </td> </tr>
<tr> <td> 10 </td> </tr>
</table>
<table>
<tr> <td> 10 </td> </tr>
<tr> <td> 10 </td> </tr>
<tr> <td> 10 </td> </tr>
<tr> <td> 10 </td> </tr>
</table>
</html>
$ new_dom只输出\ n而不是HTML标记。我试着看看这个答案:PHP DOMDocument stripping HTML tags,但是以这种方式附加表也不起作用。
答案 0 :(得分:2)
Fatal error: Uncaught exception 'DOMException' with message 'Wrong Document Error'
因此,您无法将节点从一个文档移动到另一个文档...如果您想这样做,则必须使用importNode() deep
标记。
$dom = new DOMDocument();
$dom->loadHTMLFile('x.html');
$new_dom = new DOMDocument();
$table = '';
$nodesTable = $dom->getElementsbyTagName("table");
foreach($nodesTable as $nodeTable){
$color = $nodeTable->getAttribute('bordercolordark');
if ($color == '#73BAFF') {
$table = $new_dom->importNode($nodeTable, true);
}
}
$new_dom->appendChild($table);
echo $new_dom->saveHTML();
这只导入表元素,但不导入子元素......
注意:我在您的案例libxml_disable_entity_loader(true);
中禁用了实体加载程序。我不确定XEE攻击是否也适用于loadHTML()
,但仅仅是出于安全考虑。