I have some HTML output from an API, and I want to read all the tags from that output.
Input to the PHP script:
<table bgcolor="white" border="1" cellpadding="0" cellspacing="0" height="290" width="450" bordercolor="dodgerblue" align="center" class="txt">
<tbody>
<tr>
<td>
<table border="0" cellpadding="0" cellspacing="0" height="288" width="448" bgcolor="#ffffff" bordercolor="darkgray" class="txt">
<tbody>
<tr>
<td align="middle"><img height="18" src="/assets/images/dn1.gif" width="28"></td>
<td align="middle"></td>
<td align="middle"></td>
<td align="middle"></td>
<td align="middle"></td>
<td align="middle"></td>
<td align="middle"></td>
<td align="middle"><img height="18" src="/assets/images/up1.gif" width="28"></td>
<td align="middle"><img height="18" src="/assets/images/dn1.gif" width="28"></td>
<td align="middle"></td>
<td align="middle"></td>
<td align="middle"></td>
<td align="middle"></td>
<td align="middle"></td>
<td align="middle"></td>
<td align="middle"><img height="18" src="/assets/images/up1.gif" width="28"></td>
</tr>
<tr>
<td align="middle"></td>
<td align="middle"><img height="18" src="/assets/images/dn1.gif" width="28"></td>
<td align="middle"></td>
<td align="middle"><strong><img src="/assets/images/5.gif" width="28" height="18"></strong></td>
<td align="middle"></td>
<td align="middle"></td>
<td align="middle"><img height="18" src="/assets/images/up1.gif" width="28"></td>
<td align="middle"><strong><img src="/assets/images/4.gif" width="28" height="18"></strong></td>
<td align="middle"></td>
<td align="middle"><img height="18" src="/assets/images/dn1.gif" width="28"></td>
<td align="middle"></td>
<td align="middle"></td>
<td align="middle"><strong><img src="/assets/images/3.gif" width="28" height="18"></strong></td>
<td align="middle"></td>
<td align="middle"><img height="18" src="/assets/images/up1.gif" width="28"></td>
<td align="middle"></td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
I want the script's output to be an array, like this:
array(
[0] => First td content
[1] => Second td content
.
.
. so on...
)
I tried this: http://www.phpclasses.org/package/3022-PHP-Parse-HTML-tables-and-extract-data-into-arrays.html, but it did not work.
Answer 0 (score: 2)
If the goal is to grab the src attribute value of each <img> inside a <td>, while preserving the correct td index, something like this should do it:
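$dom = new DOMDocument();
$dom->loadHTML($html); // $html holds the API's HTML output
$xpath = new DOMXPath($dom);
// Select only the innermost cells: td elements that contain no nested td.
$tds = $xpath->query('//td[not(descendant::td)]');
$output = [];
foreach ($tds as $td) {
    $data = null; // stays null when the cell has no image
    // Look for img src attributes relative to the current cell.
    $sources = $xpath->query('.//img/@src', $td);
    foreach ($sources as $src) {
        $data = $src->value; // with several images in a cell, the last one wins
    }
    $output[] = $data;
}
var_export($output);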
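
For anyone who wants to try this on a small input, here is a minimal self-contained sketch of the same idea. The function name tdImageSources and the trimmed-down $sample markup are assumptions made for illustration; they are not part of the original answer. It also wraps the parse in libxml_use_internal_errors(), since real-world HTML often triggers parser warnings, and it takes the first image in a cell rather than the last (the two are the same when each cell has at most one image, as in the question's markup).

```php
<?php
// Hypothetical helper (name is illustrative): returns one entry per innermost
// <td>, holding the img src found in that cell, or null for an empty cell.
function tdImageSources(string $html): array
{
    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $output = [];
    // Skip wrapper cells that merely contain the nested table.
    foreach ($xpath->query('//td[not(descendant::td)]') as $td) {
        // First src attribute inside this cell, if any.
        $src = $xpath->query('.//img/@src', $td)->item(0);
        $output[] = $src ? $src->value : null;
    }
    return $output;
}

// Shortened version of the question's nested-table markup (assumed sample).
$sample = '<table><tr><td><table><tr>'
    . '<td align="middle"><img height="18" src="/assets/images/dn1.gif" width="28"></td>'
    . '<td align="middle"></td>'
    . '<td align="middle"><strong><img src="/assets/images/5.gif" width="28" height="18"></strong></td>'
    . '</tr></table></td></tr></table>';

var_export(tdImageSources($sample));
```

Empty cells come back as null, so the array index always matches the position of the cell among the innermost cells in document order.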