I'm trying to use PHP Simple HTML DOM to get only the text -> 1 2 3 <- inside the span tags.
Here is my HTML:
<div class="pager rel clr">
<span class="fbold prev abs large">
<a class="link pageNextPrev {page:1}" href="SOME LINK">
<span>«Prev</span>
</a>
</span>
<span class="item fleft">
<a class="block br3 brc8 large tdnone lheight24" href="SOME LINK">
<span>1</span>
</a>
</span>
<span class="item fleft">
<span class="block br3 c41 large tdnone lheight24 current">
<span>2</span>
</span>
</span>
<span class="item fleft">
<a class="block br3 brc8 large tdnone lheight24" href="SOME LINK">
<span>3</span>
</a>
</span>
<span class="fbold next abs large">
<a class="link pageNextPrev {page:3}" href="SOME LINK">
<span>Next»</span>
</a>
</span>
</div>
Edit: I wrote this PHP code:
$e = $html->find('div.pager',0)->children();
foreach($e as $getnextpage=>$value){
if(is_numeric($value->plaintext)){
$yey = "This Number";
}else{
$yey = "Not Number";
}
echo "</br>";
print $yey . "==>" . $value->plaintext . "</br>";
}
Result:
Not Number==> 1
Not Number==> 2
Not Number==> Next»
How can I check for the numbers?
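Answer 0: (score: -1)
Your div.pager contains span tags nested inside other span tags, so you get the text twice for every span. If you only want the page numbers, try the following: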
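$html = str_get_html($curlData);
$e = $html->find('div.pager');
foreach ($e as $getnextpage)
{
    foreach ($getnextpage->find('span.fleft') as $get) {
        // find() without an index returns an array, so ask for the first nested span
        $innerSpan = $get->find('span', 0);
        // Print the text only, trimmed of surrounding whitespace
        print(trim($innerSpan->plaintext));
    }
}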
Answer 1: (score: -1)
The only possible improvement here is to get rid of the nested foreach by using something like:
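$html = str_get_html($curlData);
$pages = array();
// Select the innermost spans of the linked page items directly
$e = $html->find('div.pager span.item a span');
foreach ($e as $getnextpage)
{
    // stripos() returns 0 for a match at position 0, so compare with === false
    if (stripos($getnextpage->innertext, 'next') === false && stripos($getnextpage->innertext, 'prev') === false)
    {
        $pages[] = $getnextpage->innertext;
    }
}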
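An alternative to the line
if(stripos($getnextpage->innertext,'next') === false && stripos($getnextpage->innertext,'prev') === false)
could be to check whether the text is a whole number. Note that is_int() tests the variable's type and is always false for a string, so a digit check is needed instead, for example:
if(ctype_digit(trim($getnextpage->innertext)))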
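To illustrate the difference between a type check and a digit check, here is a minimal sketch. It assumes the scraped text may carry surrounding whitespace (the thread does not show the raw text), and is_page_number is a hypothetical helper name, not part of Simple HTML DOM.

```php
<?php
// Hypothetical helper: true only when the trimmed text consists solely of digits.
function is_page_number(string $text): bool
{
    return ctype_digit(trim($text));
}

var_dump(is_int('2'));             // bool(false): is_int() checks the type, and '2' is a string
var_dump(is_page_number(" 2\n"));  // bool(true): whitespace is trimmed before the digit check
var_dump(is_page_number('Next»')); // bool(false): contains non-digit characters
```

Trimming first matters because, before PHP 8, is_numeric() also rejected strings with trailing whitespace.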
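Another alternative to all of the above: the spans you want are wrapped in anchor elements, and only the anchors that hold numbers have the class block. Note that the current page (2 in your HTML) sits inside a span.block rather than an anchor, so this selector skips it. So you could do something like: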
$html = str_get_html($curlData);
$pages = array();
// Only anchors with class "block" hold page numbers
$e = $html->find('div.pager span.item a.block span');
foreach ($e as $getnextpage)
{
    $pages[] = $getnextpage->innertext;
}
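For comparison, here is a rough sketch of the same idea that also picks up the current page. It deliberately swaps Simple HTML DOM for PHP's bundled DOMDocument and DOMXPath so that it runs without extra libraries. The function name extract_page_numbers is hypothetical, and the query assumes the pager markup looks like the HTML in the question.

```php
<?php
// Sketch only: uses PHP's built-in DOM extension, not Simple HTML DOM.
// extract_page_numbers is a hypothetical helper name.
function extract_page_numbers(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings about imperfect markup instead of emitting them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Innermost spans (those with no nested span) inside the pager's "item" spans.
    $query = '//div[contains(concat(" ", normalize-space(@class), " "), " pager ")]'
           . '//span[contains(concat(" ", normalize-space(@class), " "), " item ")]'
           . '//span[not(.//span)]';

    $pages = [];
    foreach ($xpath->query($query) as $span) {
        $text = trim($span->textContent);
        // Keep only texts made up entirely of digits.
        if (ctype_digit($text)) {
            $pages[] = (int) $text;
        }
    }
    return $pages;
}
```

Selecting the innermost spans avoids reading the same text twice from nested spans, and the digit check drops anything that is not a page number.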