Question

I am trying to pull the latest 4 news items from this site: http://www.wolverinegreen.com/sports/m-wrestl/spec-rel/utva-m-wrestl-spec-rel.html

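They don't have an RSS feed, so I have been reading up on PHP's preg_match function, but the syntax is a little confusing and I'm not sure exactly how to do this. Any suggestions would be really appreciated, and if there is a more efficient method I haven't thought of, I'm open to ideas.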

Answer 1

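// Get the page's HTML; file_get_contents() returns false on failure
$html = file_get_contents("http://www.wolverinegreen.com/sports/m-wrestl/spec-rel/utva-m-wrestl-spec-rel.html");
if ($html === false) {
    die("Could not fetch the page.");
}

// Create a DOMDocument object and load the HTML into it.
// Real-world markup is rarely valid, so collect libxml warnings instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

// Create an XPath object using the DOMDocument
$xpath = new DOMXPath($dom);

// Query for the news links using XPath
$items = $xpath->query("//td[1]/div/div[1]/a");

// If the query matched anything
if ($items->length)
{
    // Output each item as "title - URL", escaped for HTML output
    foreach ($items as $item) {
        echo htmlspecialchars($item->nodeValue) . " - " . htmlspecialchars($item->getAttribute("href")) . "<br />";
    }
}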
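
The question asks for only the latest 4 items, while the loop above prints every link the query matches. Here is a minimal sketch of capping the result, assuming the page lists its news newest-first. The helper name `latestLinks` and the sample markup are made up for illustration; the real page's structure may differ, so the XPath expression may need adjusting.

```php
<?php
// Hypothetical helper (not part of the original answer): returns at most
// $limit title/href pairs for the nodes matched by the XPath $query.
function latestLinks(string $html, string $query, int $limit = 4): array
{
    libxml_use_internal_errors(true);   // collect warnings from invalid markup
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $links = [];
    foreach ($xpath->query($query) as $a) {
        if (count($links) >= $limit) {
            break;                      // stop once we have enough items
        }
        $links[] = [
            'title' => trim($a->nodeValue),
            'href'  => $a->getAttribute('href'),
        ];
    }
    return $links;
}

// Made-up markup standing in for the real page: five news entries,
// each an outer div whose first inner div holds the link.
$sample = '<table><tr><td>';
for ($i = 1; $i <= 5; $i++) {
    $sample .= "<div><div><a href=\"/news/$i.html\">Story $i</a></div><div>date</div></div>";
}
$sample .= '</td></tr></table>';

$links = latestLinks($sample, "//td[1]/div/div[1]/a");
foreach ($links as $link) {
    echo $link['title'] . " - " . $link['href'] . "\n";
}
```

For the live site, the `$html` string fetched with file_get_contents() would be passed in place of `$sample`. If the hrefs turn out to be relative, they would need the site's base URL prepended before use.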

Pulling recent news items from an external site without an RSS feed - preg_match()?
