$url = file_get_contents('test.html');
$DOM = new DOMDocument();
$DOM->loadHTML(mb_convert_encoding($url, 'HTML-ENTITIES', 'UTF-8'));
$trs = $DOM->getElementsByTagName('tr');
foreach ($trs as $tr) {
    foreach ($tr->childNodes as $td) {
        echo ' ' . $td->nodeValue;
    }
}
test.html:
<html>
<body>
<table>
<tbody>
<tr>
<td style="background-color: #FFFF80;">1</td>
<td><a href="test1.php" title="test1">test1</a></td>
</tr>
<tr>
<td style="background-color: #FFFF80;">2</td>
<td><a href="test2.php" title="test2">test2</a></td>
</tr>
<tr>
<td style="background-color: #FFFF80;">3</td>
<td><a href="test3.php" title="test3">test3</a></td>
</tr>
</tbody>
</table>
</body>
</html>
As a result I get:
1 test1 2 test2 3 test3
But how do I get the link from the a inside each td? And how do I get the HTML of a td?
P.S.: I tried $td->find('a'); and $td->getElementsByTagName('a');, but neither works...
Answer 0 (score: 2)
I improved your code; this version works fine for me:
$url = file_get_contents('test.html');
$DOM = new DOMDocument();
$DOM->loadHTML(mb_convert_encoding($url, 'HTML-ENTITIES', 'UTF-8'));
$trs = $DOM->getElementsByTagName('tr');
foreach ($trs as $tr) {
    foreach ($tr->childNodes as $td) {
        if ($td->hasChildNodes()) { // skips the whitespace text nodes between the <td> elements
            foreach ($td->childNodes as $i) {
                // only element nodes carry attributes; text nodes are skipped
                if ($i instanceof DOMElement && $i->hasAttribute('href')) {
                    echo $i->getAttribute('href') . "\n"; // print the href attribute
                }
            }
        }
    }
}
Result:
test1.php
test2.php
test3.php
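The question also asks how to get the HTML of a td, and why $td->getElementsByTagName('a') failed. A likely cause is that $tr->childNodes also contains whitespace text nodes, which have no getElementsByTagName() method (and find() is not part of DOMDocument at all). Below is a minimal sketch of one way around this, not the answerer's method: it iterates over the td elements directly and uses DOMDocument::saveHTML() with a node argument to get the cell's markup. The function name extractCells and the inline sample string are assumptions made for illustration, and the encoding conversion from the original code is left out because the sample is plain ASCII.

```php
<?php
// Hypothetical helper (not from the original post): for every <td>, collect
// the href of each link inside it and the cell's own HTML.
function extractCells(string $html): array
{
    libxml_use_internal_errors(true); // keep parser warnings out of the output
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();

    $cells = [];
    // getElementsByTagName() yields element nodes only, so there are no
    // whitespace text nodes to filter out here.
    foreach ($dom->getElementsByTagName('td') as $td) {
        $links = [];
        foreach ($td->getElementsByTagName('a') as $a) {
            $links[] = $a->getAttribute('href');
        }
        $cells[] = [
            'links' => $links,
            'html'  => trim($dom->saveHTML($td)), // markup of this <td>, tags included
        ];
    }
    return $cells;
}

// Sample input assumed for illustration; test.html could be loaded with
// file_get_contents() instead.
$html = '<table><tr><td>1</td><td><a href="test1.php" title="test1">test1</a></td></tr></table>';
foreach (extractCells($html) as $cell) {
    echo implode(', ', $cell['links']), ' | ', $cell['html'], "\n";
}
```

Each entry holds the links found in one cell (an empty array for cells without an a) together with that cell's HTML, so the two pieces of information the question asks about stay paired per td.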