I'm not entirely sure I'm asking this correctly, but here goes. I have an HTML file with the following structure:
<div class="tbody">
<div class="row">
<div class="col th">
<a class="channel_sched_link" href="javascript:void(0)" title="Channel A schedule" data-channelid="9">
<img src="http://xxxxx/images/tv/A.JPG" width="30" height="20" alt="Channel A" />Channel A </a>
</div>
<div class="prog_cols">
<div class="col ts ts_1 prog_802176 ps_0" data-catid="" >
<span class="prog_name">First Program</span>
<div class="prog_time">February 24, 2015, 4:00 pm - 6:00 pm</div>
<a class="btn_watchlist " href="javascript:void(0)" data-progid="802176"> (+) add to watchlist</a>
<div class="prog_desc">
This is the first program for channel A.<br/>
<a class="watchnow" href="http://xxxx/channels/?q=Channel A">Watch Now</a>
</div>
</div>
<div class="col ts ts_3 prog_802177 ps_1" data-catid="" >
<span class="prog_name">Second Program</span>
<div class="prog_time">February 24, 2015, 6:00 pm - 8:00 pm</div>
<a class="btn_watchlist " href="javascript:void(0)" data-progid="802177">(+) add to watchlist</a>
<div class="prog_desc">
This is the second program for channel A.<br/>
<a class="watchnow" href="http://www.xxxxx/channels/?q=Channel A">Watch Now</a>
</div>
</div>
</div>
<a class="watchnow" href="http://xxxx/channels/?q=Channel A">Watch Now</a>
</div>
<div class="row">
<div class="col th">
<a class="channel_sched_link" href="javascript:void(0)" title="Channel B schedule" data-channelid="1">
<img src="http://xxxx/images/tv/B.gif" width="30" height="20" alt="Channel B" />Channel B </a>
</div>
<div class="prog_cols">
<div class="col ts ts_1 prog_802210 news ps_0" data-catid="news" >
<span class="prog_name">First Program</span>
<div class="prog_time">February 24, 2015, 5:00 pm - 6:00 pm</div>
<a class="btn_watchlist " href="javascript:void(0)" data-progid="802210">(+) add to watchlist</a>
<div class="prog_desc">
First Program Channel B.<br/>
<a class="watchnow" href="http://xxxxxx/channels/?q=Channel B">Watch Now</a>
</div>
</div>
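I am able to parse the prog_name for each channel, but only the first instance of prog_name, using
$programname = $xpath->query('//span[@class="prog_name"]');
Once I have that, I save it to an XML file along with other information. How can I parse every prog_name for each channel? I know it probably involves a loop, but I'm at a loss. Not every channel has the same number of prog_name elements.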
Answer 0 (score: 1)
This works for your HTML:
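$dom = new DOMDocument();
$dom->loadHTML($html); // $html holds the markup from the question
$xpath = new DOMXPath($dom);
// query() returns a DOMNodeList containing every match, not just the first
$childs = $xpath->query('//span[@class="prog_name"]');
foreach ($childs as $child) {
    var_dump($child->nodeValue);
}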
It returns:
string(13) "First Program"
string(14) "Second Program"
string(13) "First Program"
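The loop above returns every prog_name in one flat list, so it is no longer clear which channel each program belongs to. A minimal sketch of one way to keep them grouped, assuming each channel sits in its own div with class "row" as in the question: loop over the rows first, then run a relative XPath query (note the leading dot) with the row as the context node. The $schedule array and the shortened $html sample are hypothetical and only there for illustration.

```php
<?php
// Trimmed-down stand-in for the schedule markup in the question.
$html = <<<'HTML'
<div class="tbody">
<div class="row">
<div class="col th"><a class="channel_sched_link" title="Channel A schedule">Channel A </a></div>
<div class="prog_cols">
<div class="col ts ts_1"><span class="prog_name">First Program</span></div>
<div class="col ts ts_3"><span class="prog_name">Second Program</span></div>
</div>
</div>
<div class="row">
<div class="col th"><a class="channel_sched_link" title="Channel B schedule">Channel B </a></div>
<div class="prog_cols">
<div class="col ts ts_1"><span class="prog_name">First Program</span></div>
</div>
</div>
</div>
HTML;

// Collect parser warnings instead of printing them.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Hypothetical result structure: channel name => list of program names.
$schedule = [];

// Outer loop: one iteration per channel row.
foreach ($xpath->query('//div[@class="row"]') as $row) {
    // The leading "." makes the query relative to $row instead of the whole document.
    $link = $xpath->query('.//a[@class="channel_sched_link"]', $row)->item(0);
    $channel = $link ? trim($link->textContent) : 'unknown';

    // Inner loop: every prog_name inside this row, however many there are.
    $schedule[$channel] = [];
    foreach ($xpath->query('.//span[@class="prog_name"]', $row) as $prog) {
        $schedule[$channel][] = trim($prog->nodeValue);
    }
}

print_r($schedule);
```

With the sample markup this should give Channel A two programs and Channel B one. From there, each channel and its programs could be written to the XML file inside the same nested loops.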