Question

I want to read some pages/websites from the Internet using file_get_contents and a proxy. I came up with the following code:

$opts = array('http' => array('proxy' => '14.199.56.205:8909', 'request_fulluri' => true));

$context = stream_context_create($opts);

$test = file_get_contents('http://www.google.com', false, $context);

echo $test;

I took the proxy from the list at http://www.hidemyass.com/proxy-list/

I tested the proxy and it works in a browser, but with file_get_contents I only get a blank page.

Where is the mistake? :)

Answer 1

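Free proxies are hit or miss and often fail for one reason or another. Here is a function I use: it picks proxies at random from an array and tries up to three times, looking for an HTTP 200. As a last resort, it uses anonymouse.org to fetch the file.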

function proxy($url) {

    $proxies = array(); 
    $proxies[] = '1.1.1.1:80';
    $proxies[] = '1.1.1.1:80';
    $proxies[] = '1.1.1.1:80';
    $proxies[] = '1.1.1.1:80';
    $proxies[] = '1.1.1.1:80';
    $proxies[] = '1.1.1.1:80';

    // Fail early if the cURL extension is not available.
    if (!function_exists('curl_init')) { die('Sorry cURL is not installed!'); }

    $http=0;
    $try=0;
    while (true) {
        // Pick a random proxy for this attempt.
        $proxy = $proxies[array_rand($proxies)];
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_REFERER, "http://www.yomamma.com/");
        curl_setopt($ch, CURLOPT_USERAGENT, "MozillaXYZ/1.0");
        curl_setopt($ch, CURLOPT_HEADER, 0);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, 10);
        curl_setopt($ch, CURLOPT_PROXY, $proxy);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
        $output = curl_exec($ch);
        $http = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        curl_close($ch);
        // Stop as soon as a proxy returns HTTP 200.
        if ($http==200) { break; }
        // Give up on the proxies after three failed attempts.
        $try++;
        if($try>2) { break; }
    }

    if ($http!=200) {
        $output=file_get_contents("http://anonymouse.org/cgi-bin/anon-www.cgi/$url");
    } 

    return $output;

}

Answer 2

Nowadays most websites use HTTPS. So in the $opts variable you should use "HTTPS" instead of "HTTP".
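
For comparison, here is a minimal sketch of a proxied stream context. It assumes the conventions described in the PHP manual, where the proxy is written as tcp://host:port and the context options for both http:// and https:// URLs are read from the 'http' key. The helper name build_proxy_options() and the timeout value are made up for illustration and do not come from the posts above.

```php
<?php
// Hypothetical helper (not from the original posts): builds the options
// array for stream_context_create() from a "host:port" proxy string.
function build_proxy_options(string $hostPort): array
{
    return [
        // The 'http' options are used for both http:// and https:// URLs.
        'http' => [
            // The PHP manual writes proxy addresses as tcp://host:port.
            'proxy'           => 'tcp://' . $hostPort,
            // Send the full URI in the request line, as in the question.
            'request_fulluri' => true,
            // Read timeout in seconds (assumed value).
            'timeout'         => 10,
        ],
    ];
}

// Proxy address taken from the question; it may no longer be reachable.
$opts    = build_proxy_options('14.199.56.205:8909');
$context = stream_context_create($opts);

// Uncomment to perform the request through the proxy:
// $page = @file_get_contents('https://www.google.com', false, $context);
// if ($page === false) {
//     echo "Request through proxy failed\n";
// } else {
//     echo $page;
// }
```

Checking the return value for false helps tell a failed proxy apart from a genuinely empty page, which is hard to see when the result is echoed directly.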

file_get_contents through a proxy
