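I am trying to read specific values from this HTML table with a PHP DOM parser. I want my code to read only the "td width" cells and output just those items from the table, like this:
“WAITLIST,91630,ACCY 2001,10,Intro Financial Accounting,3.00,Zou,Y,Duques 251,9:35 AM-10:50 AM,01/13/14 - 04/28/14”
Here is the HTML table: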
<table width="100%" border="0" cellspacing="1" cellpadding="0" bgcolor="#006699">
<tr align="center" class="tableRow1Font">
<td width="7%">WAITLIST</td>
<td width="5%">91630</td>
<td width="11%">
ACCY <A HREF="http://www.gwu.edu/~bulletin/ugrad/accy.html#2001" target="_blank">2001</A>
</td>
<td width="5%">10</td>
<td width="16%">Intro Financial Accounting</td>
<td width="6%">3.00</td>
<td width="8%"> Zou, Y</td>
<td width="8%"><A HREF="http://www.gwu.edu/~map/building.cfm?BLDG=DUQUES" target="_blank" >DUQUES</a> 251</td>
<td width="13%">TR<br>09:35AM - 10:50AM</td>
<td width="14%">
01/13/14 - 04/28/14
</td>
<td width="7%">
</td>
</tr>
</table>
Here is my PHP code. It grabs the whole table, includes some elements I don't want in my output, and repeats the output several times:
// Retrieve the DOM from a given URL
$html = file_get_html('testdata.html');
foreach($html->find('table') as $e){
foreach($html->find('td') as $f){
echo $f->innertext . '<br>';
}
}
How can I change the code so that it grabs and outputs only these elements: “WAITLIST,91630,ACCY 2001,10,Intro Financial Accounting,3.00,Zou,Y,Duques 251,9:35 AM-10:50 AM,01/13/14 - 04/28/14”
Answer 0 (score: 1)
// Retrieve the DOM from a given URL
$html = file_get_html('testdata.html');
foreach($html->find('table') as $e){
foreach($e->find('td') as $f){
echo strip_tags($f->innertext) . '<br>';
}
}
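You were very close. The inner loop should call find('td') on $e, the current table, rather than on $html; searching the whole document once per table is what repeated the output.
You also forgot about the tags inside the cells. See if strip_tags works for you.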
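If you want the comma-separated line from the question rather than one cell per row, here is a minimal sketch of one way to do it. Note that it swaps simple_html_dom for PHP's built-in DOMDocument and DOMXPath so that it runs on its own. The $markup string is a shortened stand-in for testdata.html, and the variable names are only illustrative.

```php
<?php
// Sketch only: uses PHP's built-in DOM extension, not simple_html_dom.
// $markup is a shortened, assumed stand-in for the contents of testdata.html.
$markup = <<<HTML
<table width="100%" border="0">
<tr align="center" class="tableRow1Font">
<td width="7%">WAITLIST</td>
<td width="5%">91630</td>
<td width="11%">
ACCY <A HREF="http://www.gwu.edu/~bulletin/ugrad/accy.html#2001" target="_blank">2001</A>
</td>
<td width="8%"> Zou, Y</td>
<td width="7%">
</td>
</tr>
</table>
HTML;

$doc = new DOMDocument();
// Real-world HTML is rarely valid; collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($markup);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$cells = [];

// Select only <td> elements that carry a width attribute.
foreach ($xpath->query('//td[@width]') as $td) {
    // textContent drops nested tags; collapse whitespace runs and trim the ends.
    $text = trim(preg_replace('/\s+/', ' ', $td->textContent));
    if ($text !== '') {
        // Keep non-empty cells only, so the trailing blank cell is skipped.
        $cells[] = $text;
    }
}

// Join the cells into one comma-separated line.
$line = implode(',', $cells);
echo $line, PHP_EOL;
```

For the shortened sample this should print WAITLIST,91630,ACCY 2001,Zou, Y. A cell that contains a <br>, like the time cell in your table, would come out as TR09:35AM - 10:50AM, because dropping the tag does not insert a space; you would need to handle that case separately.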