I'm trying to practice cURL, but it isn't going well. Please tell me what is wrong. Here is my code:
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://xxxxxxx.com/");
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_USERAGENT, "Google Bot");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$downloaded_page = curl_exec($ch);
curl_close($ch);
preg_match_all('/<div\s* class =\"abc\">(.*)<\/div>/', $downloaded_page, $title);
echo "<pre>";
print($title[1]);
echo "</pre>";
The warning is: Notice: Array to string conversion
The HTML I want to parse looks like this:
<div class="abc">
<ul> blablabla </ul>
<ul> blablabla </ul>
<ul> blablabla </ul>
</div>
Answer 0 (score: 1)
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://www.lipsum.com/');
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$html = curl_exec($ch);
curl_close($ch);
$dom = new DOMDocument;
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
// foreach ($xpath->query('//div') as $div) { // all divs in the HTML
foreach ($xpath->query('//div[contains(@class, "abc")]') as $div) { // all divs whose class contains "abc"
    // $div->nodeValue contains the fetched div content
}
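To make the loop above concrete, here is a minimal sketch that runs the same XPath query against the sample markup from the question instead of a downloaded page. The `$contents` array and the extra `other` div are assumptions added for illustration, and `libxml_use_internal_errors()` is used here in place of the `@` operator.

```php
<?php
// Sample markup copied from the question; a real page would come from curl_exec().
$html = <<<HTML
<html><body>
<div class="abc">
<ul> blablabla </ul>
<ul> blablabla </ul>
<ul> blablabla </ul>
</div>
<div class="other">ignored</div>
</body></html>
HTML;

$dom = new DOMDocument;
// Collect parser warnings internally instead of silencing them with @.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Hypothetical result container, not part of the original answer.
$contents = [];
foreach ($xpath->query('//div[contains(@class, "abc")]') as $div) {
    $contents[] = [
        'text' => trim($div->nodeValue),   // text content only, tags stripped
        'html' => $dom->saveHTML($div),    // serialized node, inner tags kept
    ];
}

print_r($contents);
```

Note that `contains(@class, "abc")` is a substring test, so it would also select a div with a class such as `abcd`.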
Answer 1 (score: 1)
preg_match_all returns an array of arrays.
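A small sketch of what that structure looks like, using a made-up two-line `$page` string (an assumption for illustration, not the real downloaded page):

```php
<?php
// Hypothetical input: two single-line divs with class "abc".
$page = '<div class="abc">first</div>' . "\n" . '<div class="abc">second</div>';

preg_match_all('/<div\s+class="abc">(.*)<\/div>/', $page, $title);

// $title[0] holds the full matches, $title[1] holds capture group 1.
print_r($title);

// $title[1] is itself an array, so join it (or loop over it) before printing.
echo implode("\n", $title[1]), "\n";
```

Passing `$title[1]` straight to `print` is what triggers the "Array to string conversion" notice.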
If your code is:
preg_match_all('/<div\s+class="abc">(.*)<\/div>/', $downloaded_page, $title);
then what you actually want to do is this:
echo "<pre>";
foreach ($title[1] as $realtitle) {
    echo $realtitle . "\n";
}
echo "</pre>";
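This is needed because the pattern searches for every div with the class "abc", so there can be several matches. I also suggest hardening your regular expression to make it more robust: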
preg_match_all('/<div[^>]+class="abc"[^>]*>(.*)<\/div>/', $downloaded_page, $title);
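This will also match div tags that carry other attributes alongside class="abc".
BTW: DOMDocument is slow. I have found that regular expressions can sometimes be up to 40 times faster, depending on the size of your document. Keep it simple. Best, Nicolas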
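One caveat for the multi-line markup shown in the question: by default `.` does not match newlines, so `(.*)` cannot span a div whose content sits on several lines. The sketch below compares the pattern as written with a variant that adds the `s` modifier and a lazy `(.*?)`. The `$page` string and the variable names are assumptions for illustration.

```php
<?php
// Hypothetical input: one multi-line div and one single-line div.
$page = <<<HTML
<div id="main" class="abc">
<ul> blablabla </ul>
<ul> blablabla </ul>
</div>
<div class="abc">second</div>
HTML;

// Without the s modifier, "." stops at newlines, so the multi-line div is missed.
preg_match_all('/<div[^>]+class="abc"[^>]*>(.*)<\/div>/', $page, $single_line);

// With s, "." also matches newlines; (.*?) stops at the first closing </div>.
preg_match_all('/<div[^>]+class="abc"[^>]*>(.*?)<\/div>/s', $page, $multi_line);

echo count($single_line[1]), " match(es) without s\n";
echo count($multi_line[1]), " match(es) with s\n";
```

The lazy variant still stops at the first `</div>`, so it would cut nested divs short. That is the kind of case where a DOM parser is the safer choice.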