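So, I have this awesome web-scraper code. It fetches the requested data from the given site and prints it along with the associated links. (Good boy.)
The question now is how to limit the extracted data to 5 rows. I tried using "LIMIT 5" (as we usually do in PHP SQL queries), but it doesn't work.
My code is below: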
<div class="news-entry">
<div class="newsblock">
<div style="clear:both"></div>
<h2>
<a rel="nofollow" target="_blank" href="http://www.usmle-forums.com/usmle-step-3-forum/">
USMLE-Forums :: STEP-3
</a>
</h2>
<ul>
<?php
// Fetch a URL with cURL and return the response body as a string.
function get_datafour($url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_URL, $url);
    $result = curl_exec($ch);
    curl_close($ch);
    return $result;
}
$returned_content = get_datafour('http://www.usmle-forums.com/usmle-step-3-forum/');
// Isolate the thread table body, then split it into rows.
$first_step = explode('<tbody id="threadbits_forum_30">', $returned_content);
$second_step = explode('</tbody>', $first_step[1]);
$third_step = explode('<tr>', $second_step[0]);
// print_r($third_step);
foreach ($third_step as $element) {
    // Pull the first link out of the row's "alt1" cell.
    $child_first = explode('<td class="alt1"', $element);
    $child_second = explode('</td>', $child_first[1]);
    $child_third = explode('<a href=', $child_second[0]);
    $child_fourth = explode('</a>', $child_third[1]);
    $final = "<a href=" . $child_fourth[0] . "</a><br>";
?>
<li target="_blank" class="itemtitle">
<span class="item_new"></span><?php echo $final ?>
</li>
<?php
}
?>
</ul>
<div style="clear:both"></div>
</div>
</div>
Any suggestions are appreciated.
Answer 0: (score: 1)
Break out of the foreach loop after the 5th result:
foreach ($third_step as $key => $element) {
    // Your logic here
    if ($key == 4) {
        break;
    }
}
We use $key == 4 because array indexes start at 0, so index 4 is the fifth element. Hope that makes it clear.
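An alternative to breaking out of the loop is to trim the array before iterating over it. The sketch below is only an illustration under a few assumptions: the sample `$third_step` array is made-up stand-in data rather than real scraped rows, and it assumes the first fragment returned by `explode('<tr>', ...)` may be blank (the text before the first `<tr>`), which is why blank entries are filtered out first. `array_filter` and `array_slice` are standard PHP functions.

```php
<?php
// Hypothetical stand-in for the result of explode('<tr>', ...).
// The leading whitespace entry mimics the text before the first <tr>.
$third_step = ["\n", 'row 1', 'row 2', 'row 3', 'row 4', 'row 5', 'row 6', 'row 7'];

// Drop fragments that contain only whitespace.
$rows = array_filter($third_step, function ($row) {
    return trim($row) !== '';
});

// Keep at most the first five rows; array_slice reindexes integer keys from 0.
$limited = array_slice($rows, 0, 5);

foreach ($limited as $element) {
    // The per-row parsing from the question would go here.
    echo $element, "\n";
}
```

With this approach the loop body needs no counter or `break`, and blank fragments do not count toward the five rows.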