Question

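Hello, I'm using the preg_match_all function to scrape content from a page, but when I try to grab certain specific parts (such as the details section), it gives me back an array!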

The markup on that page looks like this:

<div class="f slp">DETAILS I WANT TO GET</div>

Previously, to grab URLs, I used code like this:

//so this gets URLs in href=""
preg_match_all('/a href="([^"]+)" class=l.+?>.+?<\/a>/',$scraped,$results);

But this time I want to grab some details on that page, which sit inside this structure:

<div class="f slp">DETAILS I WANT TO GET</div>

Answer 1

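// $source is the fetched page HTML (called $scraped in the question).
// s lets . match newlines, i ignores case, and (.*?) is non-greedy.
preg_match_all("#<div class=\"f slp\">(.*?)<\/div>#si", $source, $match);

// $match[1] holds the text captured by the first group, one entry per div.
foreach($match[1] as $val) {
    echo $val."<br>";
}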
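
The array the question mentions is how preg_match_all reports its results: index 0 holds the full matches and index 1 holds whatever the first capture group matched. A minimal sketch of that layout follows; the sample string in $source is made up for illustration and is not the asker's page.

```php
<?php
// Hypothetical sample input standing in for the fetched page.
$source = '<div class="f slp">first detail</div>'
        . '<p>other markup</p>'
        . '<div class="f slp">second detail</div>';

preg_match_all('#<div class="f slp">(.*?)</div>#si', $source, $match);

// Default ordering (PREG_PATTERN_ORDER):
// $match[0] = every full match, $match[1] = every capture of group 1.
print_r($match[0]);
print_r($match[1]);

// With PREG_SET_ORDER each element is one match: [full match, group 1].
preg_match_all('#<div class="f slp">(.*?)</div>#si', $source, $sets, PREG_SET_ORDER);
foreach ($sets as $set) {
    echo $set[1], PHP_EOL;
}
```

If only the first occurrence is needed, preg_match with the same pattern fills $match[1] with a single string instead of a list.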

Answer 2

Take a look at PHP Simple HTML DOM Parser, a very easy-to-use library that makes it simple to extract content from HTML.

// from the documentation
// Assumes the library's simple_html_dom.php is on the include path.
require_once 'simple_html_dom.php';

$html = str_get_html("<div>foo <b>bar</b></div>");
$e = $html->find("div", 0);
echo $e->tag; // Returns: "div"
echo $e->outertext; // Returns: "<div>foo <b>bar</b></div>"
echo $e->innertext; // Returns: "foo <b>bar</b>"
echo $e->plaintext; // Returns: "foo bar"

Read more in the manual.
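
If installing a library is not an option, the same parser-based idea can be sketched with PHP's built-in DOM extension. This is not the Simple HTML DOM Parser API shown above but a swapped-in alternative, and it assumes the dom extension is enabled and that the class attribute is exactly "f slp". The sample HTML is made up for illustration.

```php
<?php
// Hypothetical sample input standing in for the fetched page.
$source = '<html><body>'
        . '<div class="f slp">DETAILS I WANT TO GET</div>'
        . '<div class="other">skip this</div>'
        . '</body></html>';

// Collect parser warnings about sloppy real-world HTML instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($source);
libxml_clear_errors();

// Select every div whose class attribute is exactly "f slp".
$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//div[@class="f slp"]');

$details = [];
foreach ($nodes as $node) {
    $details[] = trim($node->textContent);
}
print_r($details);
```

Unlike the regular expression, this keeps working when the div contains nested tags or when whitespace inside the element changes.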

How to get a specific part of a page with preg_match_all
