curl_multi_exec - parsing HTML

Time: 2011-03-10 05:44:14

Tags: php curl html-parsing web-scraping

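I found this script on php.net. Say I only want to get information from one part of a page. How can I do that? I know how to do it with curl_init, but the multi interface seems more efficient.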

For example:

From php.net

<?php
// create both cURL resources
$ch1 = curl_init();
$ch2 = curl_init();

// set URL and other appropriate options
curl_setopt($ch1, CURLOPT_URL, "http://lxr.php.net/");
curl_setopt($ch1, CURLOPT_HEADER, 0);
curl_setopt($ch2, CURLOPT_URL, "http://www.php.net/");
curl_setopt($ch2, CURLOPT_HEADER, 0);

//create the multiple cURL handle
$mh = curl_multi_init();

//add the two handles
curl_multi_add_handle($mh,$ch1);
curl_multi_add_handle($mh,$ch2);

$active = null;
//execute the handles
do {
    $mrc = curl_multi_exec($mh, $active);
} while ($mrc == CURLM_CALL_MULTI_PERFORM);

while ($active && $mrc == CURLM_OK) {
    if (curl_multi_select($mh) != -1) {
        do {
            $mrc = curl_multi_exec($mh, $active);
        } while ($mrc == CURLM_CALL_MULTI_PERFORM);
    }
}

//close the handles
curl_multi_remove_handle($mh, $ch1);
curl_multi_remove_handle($mh, $ch2);
curl_multi_close($mh);

?>
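
As written, the php.net example prints each response straight to output, so there is nothing to parse afterwards. A minimal sketch of one way to capture the bodies instead, assuming the curl extension is loaded: set CURLOPT_RETURNTRANSFER on each handle and read the result with curl_multi_getcontent(). The function name fetch_all is hypothetical and not part of cURL, and no error checking is done here.

```php
<?php
// Hypothetical helper (not from php.net): fetch several URLs in parallel
// and return each response body, keyed like the input array.
function fetch_all(array $urls): array
{
    $mh = curl_multi_init();
    $handles = [];

    foreach ($urls as $key => $url) {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_HEADER, 0);
        // Return the body as a string instead of printing it.
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_multi_add_handle($mh, $ch);
        $handles[$key] = $ch;
    }

    // Drive all transfers until none are still active.
    $active = null;
    do {
        $mrc = curl_multi_exec($mh, $active);
        if ($active && curl_multi_select($mh) === -1) {
            usleep(1000); // avoid a busy loop if select() fails
        }
    } while ($active && ($mrc === CURLM_OK || $mrc === CURLM_CALL_MULTI_PERFORM));

    // Collect each body, then detach the handle from the multi handle.
    $bodies = [];
    foreach ($handles as $key => $ch) {
        $bodies[$key] = curl_multi_getcontent($ch);
        curl_multi_remove_handle($mh, $ch);
    }
    curl_multi_close($mh);

    return $bodies;
}
```

Called as `$pages = fetch_all(['lxr' => 'http://lxr.php.net/', 'php' => 'http://www.php.net/']);`, this would give one HTML string per key, which can then be parsed like any other string.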

I want to extract the following from the response:

<b>Key enhancements in PHP 5.3.3 include:</b> 
     </p> 
     <ul> 
             <li>Upgraded bundled sqlite to version 3.6.23.1.</li> 
             <li>Upgraded bundled PCRE to version 8.02.</li> 
             <li>Added FastCGI Process Manager (FPM) SAPI.</li> 
             <li>Added stream filter support to mcrypt extension.</li> 
             <li>Added full_special_chars filter to ext/filter.</li> 
             <li>Fixed a possible crash because of recursive GC invocation.</li> 
             <li>Fixed bug #52238 (Crash when an Exception occured in iterator_to_array).</li> 
             <li>Fixed bug #52041 (Memory leak when writing on uninitialized variable returned from function).</li> 
             <li>Fixed bug #52060 (Memory leak when passing a closure to method_exists()).</li> 
             <li>Fixed bug #52001 (Memory allocation problems after using variable variables).</li> 
             <li>Fixed bug #51723 (Content-length header is limited to 32bit integer with Apache2 on Windows).</li> 
             <li>Fixed bug #48930 (__COMPILER_HALT_OFFSET__ incorrect in PHP &gt;= 5.3).</li> 
     </ul>
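
One possible way to pull out a fragment like this, sketched with PHP's bundled DOM extension. It assumes the response body is already held in a string and that the wanted list is the first <ul> after the <b> heading. The helper name extract_list_after is hypothetical.

```php
<?php
// Hypothetical helper: return the text of each <li> in the first <ul>
// that follows a <b> element whose text contains $marker.
function extract_list_after(string $html, string $marker): array
{
    if ($html === '') {
        return [];
    }

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    foreach ($xpath->query('//b') as $b) {
        if (strpos($b->textContent, $marker) === false) {
            continue; // not the heading we are looking for
        }
        $items = [];
        // First <ul> after this <b> in document order, then its <li> children.
        foreach ($xpath->query('following::ul[1]/li', $b) as $li) {
            $items[] = trim($li->textContent);
        }
        return $items;
    }

    return []; // marker not found
}
```

With a page body in `$html`, `extract_list_after($html, 'Key enhancements in PHP 5.3.3')` would return the list items as plain strings. Matching on the heading text rather than on position keeps the lookup independent of the rest of the page layout, though it breaks if the heading wording changes.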

0 answers:

No answers