I'm trying to pull a specific section out of a remote HTML page. Here is the markup:
<div id="content">
<div class="main-wide">
<ul id="nav-sub">
<li id="sub-list"><a href="/events/" class="on">List View</a></li>
<li id="sub-cal"><a href="/events/calendar/">Calendar View</a></li>
</ul>
<h2 id="ev-201302">February 2013 <a href="/events/calendar/02/2013" title="Events Calendar for February 2013" class="cal">(Calendar View)</a></h2>
<ul class="lst lst-lg">
<li>
<h3><a href="http://site.com/link_1">link text one</a></h3>
<ul class="meta">
<li>February 1st - February 2nd, 2013</li>
</ul>
</li>
<li>
<h3><a href="http://site.com/link_2">link text two</a></h3>
<ul class="meta">
<li>February 1st - February 28th, 2013</li>
</ul>
</li>
<li>
<h3><a href="http://site.com/link_3">link text three</a></h3>
<ul class="meta">
<li>February 1st - February 15th, 2013</li>
</ul>
</li>
</ul>
</div>
</div>
I want to grab everything inside <ul class='lst lst-lg'> and turn it into something I can echo in the format I want, so that it looks like this:
<tr>
<td align='left'>February 1st - February 2nd, 2013</td>
<td align='left'><a href='http://site.com/link_1'>Link text one</a></td>
</tr>
<tr>
<td align='left'>February 1st - February 28th, 2013</td>
<td align='left'><a href='http://site.com/link_2'>Link text two</a></td>
</tr>
<tr>
<td align='left'>February 1st - February 15th, 2013</td>
<td align='left'><a href='http://site.com/link_3'>Link text three</a></td>
</tr>
And so on. This is what I have so far:
function get_data($url) {
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$data = curl_exec($ch);
curl_close($ch);
return $data;
}
$page = get_data('http://site.com/index.php');
$doc = new DOMDocument();
$doc->preserveWhiteSpace = false;
$doc->loadHTML($page);
$uls = $doc->getElementsByTagName('ul');
$i = 0;
while($table = $uls->item($i++)){
$class_node = $table->attributes->getNamedItem('class');
$li_node = $table->nodeName;
if($class_node){
echo $table->nodeName . " - " . $table->nodeValue . "<br>";
}
}
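So far I have only been trying to echo the values, so what's inside the while loop is mostly me playing around and trying to learn. I can already get the information from the page; the problem at this point is picking out the right pieces and formatting them.
Thanks a lot!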
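For reference, here is a rough sketch of one way the list items could be mapped to table rows, using DOMXPath instead of walking every <ul>. It is not a definitive solution. The function name events_to_rows is hypothetical, the inline sample stands in for the page returned by get_data(), and it assumes the class attribute is exactly "lst lst-lg" and that each item has one link inside <h3> and one date inside <ul class="meta">. The ucfirst() call is only there because the desired output shows the link text capitalized.

```php
<?php
// Sample markup standing in for the fetched page (assumption: a trimmed copy
// of the structure shown above, with the href quotes closed).
$page = <<<HTML
<div id="content">
<ul id="nav-sub"><li id="sub-list"><a href="/events/">List View</a></li></ul>
<ul class="lst lst-lg">
<li><h3><a href="http://site.com/link_1">link text one</a></h3>
<ul class="meta"><li>February 1st - February 2nd, 2013</li></ul></li>
<li><h3><a href="http://site.com/link_2">link text two</a></h3>
<ul class="meta"><li>February 1st - February 28th, 2013</li></ul></li>
</ul>
</div>
HTML;

// Hypothetical helper: builds one <tr> per event <li>.
function events_to_rows(string $html): string
{
    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $rows = '';

    // Only the direct <li> children of the events list, not the nested "meta" items.
    foreach ($xpath->query('//ul[@class="lst lst-lg"]/li') as $li) {
        $link = $xpath->query('.//h3/a', $li)->item(0);
        $date = $xpath->query('.//ul[@class="meta"]/li', $li)->item(0);
        if (!$link || !$date) {
            continue; // skip items that lack a link or a date
        }

        // Escape everything before putting it back into markup.
        $href = htmlspecialchars($link->getAttribute('href'), ENT_QUOTES);
        $text = htmlspecialchars(ucfirst(trim($link->textContent)), ENT_QUOTES);
        $when = htmlspecialchars(trim($date->textContent), ENT_QUOTES);

        $rows .= "<tr>\n<td align='left'>{$when}</td>\n"
               . "<td align='left'><a href='{$href}'>{$text}</a></td>\n</tr>\n";
    }

    return $rows;
}

echo events_to_rows($page);
```

With the real page, $page would come from get_data('http://site.com/index.php') as in my code above. If the site adds more classes to that <ul>, the exact-match XPath test would stop matching and would need something like contains(@class, "lst-lg") instead.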