PHP: stream a website's content until a string is found

Time: 2012-09-05 09:08:03

Tags: php sockets web stream download

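The title says it all. I need to start streaming a website and stop as soon as a given string, for example </head>, is found. I want to do this to save bandwidth on both ends and to cut script run time.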

I don't want to download the whole page content into memory; I need to read the content as a stream in PHP.

Thanks, community, I love you guys :)

1 answer:

Answer 0: (score: 1)

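<?php

function streamUntilStringFound($url, $string, $timeout = 30){

    // parse_url() only recognises the host when a scheme is present, so add one if needed
    if (strpos($url, "://") === false) {
        $url = "http://" . $url;
    }
    $parts = parse_url($url);
    if (empty($parts['host'])) {
        return "Invalid URL!";
    }
    $host = $parts['host'];
    $port = isset($parts['port']) ? $parts['port'] : 80;
    $path = isset($parts['path']) ? $parts['path'] : "/";
    if (isset($parts['query'])) {
        $path .= "?" . $parts['query'];
    }

    // start the stream (plain HTTP over a raw socket)
    $fp = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if (!$fp) {
        $buffer = "Invalid URL!"; // use $errstr to show the exact error
    } else {
        stream_set_timeout($fp, $timeout); // also limit how long each read may block
        $out  = "GET $path HTTP/1.1\r\n";
        $out .= "Host: $host\r\n";
        $out .= "Connection: Close\r\n\r\n";
        fwrite($fp, $out);
        $buffer = ""; // note: the response headers end up in here too
        while (!feof($fp)) {
            $line = fgets($fp, 128);
            if ($line === false) break; // read error or timeout
            $buffer .= $line;
            // string found (case-insensitive) - stop downloading any new content
            if (stripos($buffer, $string) !== false) break;
        }
        fclose($fp);
    }

    return $buffer;

}

// download all content until closing </head> is found
$content = streamUntilStringFound("whoapi.com", "</head>");

// show us what is found
echo "<pre>".htmlspecialchars($content)."</pre>";

?>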

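Important note (thanks to @GordonM): check your php.ini. allow_url_fopen must be enabled for the URL-aware wrappers (for example fopen("http://...")); fsockopen() itself opens a raw socket and is not controlled by that setting, although a host can still block it through disable_functions.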
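
For comparison, here is a rough sketch of the same "stop once the string shows up" idea written against any open PHP stream rather than a raw socket. The helper name readStreamUntil() and the chunk size are made up for illustration and are not part of the answer above. It reads with fread() in fixed-size chunks and only re-scans the tail of the buffer, so a match that straddles two chunks should still be caught. The commented usage assumes allow_url_fopen is On and uses example.com as a placeholder URL; with the http:// wrapper the buffer would hold only the body, not the response headers.

```php
<?php

/**
 * Hypothetical helper: read from an already opened stream until $needle
 * appears (case-insensitive) or the stream ends, then return what was read.
 */
function readStreamUntil($stream, string $needle, int $chunkSize = 1024): string
{
    $buffer = "";
    $needleLen = strlen($needle);
    if ($needleLen === 0) {
        return $buffer; // nothing to search for
    }

    while (!feof($stream)) {
        $chunk = fread($stream, $chunkSize);
        if ($chunk === false || $chunk === "") {
            break; // read error, timeout, or nothing left to read
        }

        // A new match has to end inside the new chunk, so it cannot start
        // earlier than (old length - needle length + 1).
        $offset = max(0, strlen($buffer) - $needleLen + 1);
        $buffer .= $chunk;

        if (stripos($buffer, $needle, $offset) !== false) {
            break; // found: stop pulling more data
        }
    }

    return $buffer;
}

// Usage sketch (assumes allow_url_fopen=On; example.com is a placeholder):
// $stream = fopen("http://example.com/", "r");
// if ($stream !== false) {
//     echo "<pre>" . htmlspecialchars(readStreamUntil($stream, "</head>")) . "</pre>";
//     fclose($stream);
// }
```

Because the helper only needs a stream resource, it should also work with the handle returned by fsockopen() once the request has been written, which keeps the search logic separate from how the connection is opened.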