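I need to get the contents of specific divs so that I can add the links and information from an external website to my database.
The HTML looks like this: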
<div class="last">
<a href="...">test</a>
<b>title</b>
</div>
<div class="last">
<a href="...">test</a>
<b>title</b>
</div>
I am using this:
$dom = new DOMDocument();
@$dom->loadHTML($html);
// grab all the links on the page
$xpath = new DOMXPath($dom);
$hrefs = $xpath->evaluate("/html/body//a");
for ($i = 0; $i < $hrefs->length; $i++) {
    $href = $hrefs->item($i);
    $url = $href->getAttribute('href');
    $url2 = $href->getAttribute('u');
    echo $url . '<br />';
    echo $url2 . '<br />';
    echo $target_url . '<br />';
    //storeLink($url,$target_url);
    //echo "<br />Link stored: $url";
}
What regular expression would get only the a href and the b inside every <div class="last">?
Answer 0: (score: 0)
Use this function:
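function get_string($string, $start, $end)
{
    // Collect every substring found between $start and $end.
    $found = array();
    $pos = 0;
    while (true) {
        $pos = strpos($string, $start, $pos);
        if ($pos === false) {
            // Strict comparison: a match at offset 0 is not the same as false.
            return $found;
        }
        $pos += strlen($start);
        $endPos = strpos($string, $end, $pos);
        if ($endPos === false) {
            // No closing delimiter left, so stop instead of computing a negative length.
            return $found;
        }
        $found[] = substr($string, $pos, $endPos - $pos);
    }
}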
$html='<div class="last"><a href="...">test</a><b>title</b></div><div class="last"><a href="...">test</a><b>title</b></div>';
To get the anchors from the HTML:
get_string($html,'class="last"','</a>');
Similarly, to extract the data inside the b tags:
get_string($html,'<b>','</b>');
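If string searching turns out to be too brittle, the DOMXPath approach from the question can probably be narrowed to the divs with class "last" instead of switching to a regular expression. The following is only a minimal sketch under some assumptions: extract_last_divs is a hypothetical helper name, the array keys are chosen for illustration, and each div is assumed to hold at most one a and one b that matter.

```php
<?php
// Sample markup from the question; the "..." href values are elided in the original.
$html = '<div class="last"><a href="...">test</a><b>title</b></div>'
      . '<div class="last"><a href="...">test</a><b>title</b></div>';

// Hypothetical helper (not from the original post): returns one row per
// <div class="last"> with the first link's href and text and the first <b> text.
function extract_last_divs(string $html): array
{
    $dom = new DOMDocument();
    // Collect parser warnings internally instead of silencing them with @.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // Match "last" as a whole class token, so class="last item" also qualifies.
    $query = '//div[contains(concat(" ", normalize-space(@class), " "), " last ")]';

    $rows = array();
    foreach ($xpath->query($query) as $div) {
        // Relative queries (leading ".") search only inside the current div.
        $a = $xpath->query('.//a', $div)->item(0);
        $b = $xpath->query('.//b', $div)->item(0);
        $rows[] = array(
            'href'  => $a ? $a->getAttribute('href') : null,
            'text'  => $a ? trim($a->textContent) : null,
            'title' => $b ? trim($b->textContent) : null,
        );
    }
    return $rows;
}

print_r(extract_last_divs($html));
```

Each row could then be passed to whatever storage routine you use, for example the storeLink call that is commented out in the question.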