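I am trying to extract the latest 4 news items from this site: http://www.wolverinegreen.com/sports/m-wrestl/spec-rel/utva-m-wrestl-spec-rel.html
They don't have an RSS feed, so I have been reading about PHP's preg_match function, but the syntax is a bit confusing and I'm not sure how to go about it. Any suggestions would be greatly appreciated, and if there is a more efficient approach that I haven't thought of, I'm open to those ideas.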
Answer 0 (score: 1)
// Get the page's HTML
$html = file_get_contents("http://www.wolverinegreen.com/sports/m-wrestl/spec-rel/utva-m-wrestl-spec-rel.html");
if ($html === false) {
    die("Could not fetch the page.");
}
// Create a DOMDocument object and load the HTML into it.
// Real-world pages are rarely valid HTML, so collect libxml parse warnings instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();
// Create an XPath object using the DOMDocument
$xpath = new DOMXPath($dom);
// Query for the news links using XPath
$items = $xpath->query("//td[1]/div/div[1]/a");
// If the query matched anything
if ($items->length) {
    // Output each item's link text and URL
    foreach ($items as $item) {
        echo $item->nodeValue . " - " . $item->getAttribute("href") . "<br />";
    }
}
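
Since the question asks for only the latest 4 items, here is a minimal sketch of how the same XPath query could be capped at four results. It runs against a small inline HTML sample rather than the live page, so the markup, the `latestHeadlines` function name, and the assumption that the newest stories come first in the page are all hypothetical; the real page structure may differ.

```php
<?php
// Hypothetical sample markup shaped to match the XPath used above.
// It stands in for the real page, whose structure may differ.
$sampleHtml = <<<SAMPLE
<html><body><table><tr><td>
<div><div><a href="/news/1.html">Story one</a></div></div>
<div><div><a href="/news/2.html">Story two</a></div></div>
<div><div><a href="/news/3.html">Story three</a></div></div>
<div><div><a href="/news/4.html">Story four</a></div></div>
<div><div><a href="/news/5.html">Story five</a></div></div>
</td></tr></table></body></html>
SAMPLE;

/**
 * Return up to $limit headlines (title and href) from the given HTML.
 * Assumes the page lists the newest stories first.
 */
function latestHeadlines(string $html, int $limit = 4): array
{
    // Collect libxml parse warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $links = $xpath->query("//td[1]/div/div[1]/a");

    $headlines = [];
    foreach ($links as $link) {
        // Stop once the requested number of items has been collected.
        if (count($headlines) >= $limit) {
            break;
        }
        $headlines[] = [
            'title' => trim($link->nodeValue),
            'href'  => $link->getAttribute('href'),
        ];
    }
    return $headlines;
}

// Usage: print the first four headlines from the sample markup.
foreach (latestHeadlines($sampleHtml) as $headline) {
    echo $headline['title'] . " - " . $headline['href'] . "\n";
}
```

To use this on the real site, you would pass the string returned by file_get_contents in place of $sampleHtml, and adjust the XPath if the page layout does not match.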