How to make a PHP script use a proxy list

Time: 2010-08-09 17:26:25

Tags: php

I am using the Google PageRank checker script from here:

http://www.off-soft.net/en/develop/php/prcheck.html

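However, I have noticed that after too many requests the server gets temporarily banned.

I would like to route the requests through a list of proxy servers somehow. Can anyone get me started?

I am looking for any code examples of PHP requests that use a proxy list.

Thanks!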

4 answers:

Answer 0: (score: 4)

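Temporary bans exist to prevent abuse. Using proxies to get around a ban is not a nice thing to do, so you are unlikely to find anyone here who will help you violate that site's terms of service.

That being said, an HTTP proxy is just a web server that handles and honors requests for external URLs and returns the results. The rest is left as an exercise for the asker.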
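To make the description above concrete, here is a minimal sketch of the request a client would send to a forward proxy for a plain HTTP URL. The function name buildProxyRequest and the example URL are made up for illustration; the only point is that the request line carries the full URL instead of just the path.

```php
<?php
// Build the raw HTTP/1.1 request a client would send to a forward proxy.
// Unlike a direct request, the request line contains the absolute URL.
// buildProxyRequest() is a hypothetical helper, not part of any library.
function buildProxyRequest(string $url): string
{
    $host = parse_url($url, PHP_URL_HOST);
    $port = parse_url($url, PHP_URL_PORT);
    if (!is_string($host) || $host === '') {
        throw new InvalidArgumentException("URL has no host: $url");
    }
    // The Host header still names the origin server, not the proxy.
    $hostHeader = $port ? "$host:$port" : $host;

    return "GET $url HTTP/1.1\r\n"
         . "Host: $hostHeader\r\n"
         . "Connection: close\r\n\r\n";
}

echo buildProxyRequest('http://example.com/page?x=1');
```

In practice cURL builds this request for you once CURLOPT_PROXY is set, so the sketch is only meant to show what the proxy receives.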

Answer 1: (score: 4)

PHP's curl library allows you to use SOCKS5 and HTTP proxies. Before using the proxy servers, you should validate the list with a tool such as YAPH.
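As a rough sketch of the two points above, the code below turns one proxy list entry into cURL options and checks whether a proxy responds. It assumes entries look like "[scheme://]host:port" with an optional http or socks5 scheme; that format, and the names proxyCurlOptions and proxyWorks, are assumptions made for illustration. The check is a simple reachability test written in PHP, not YAPH itself.

```php
<?php
// Convert one proxy list entry into cURL options.
// Assumed entry format: "[scheme://]host:port", scheme being http or socks5.
function proxyCurlOptions(string $entry): array
{
    $entry = trim($entry);
    $type = CURLPROXY_HTTP; // default when no scheme is given
    if (stripos($entry, 'socks5://') === 0) {
        $type = CURLPROXY_SOCKS5;
        $entry = substr($entry, strlen('socks5://'));
    } elseif (stripos($entry, 'http://') === 0) {
        $entry = substr($entry, strlen('http://'));
    }

    return [
        CURLOPT_PROXY     => $entry,
        CURLOPT_PROXYTYPE => $type,
    ];
}

// Return true if $testUrl can be fetched through the proxy with HTTP 200
// before the timeout (in seconds) expires.
function proxyWorks(string $entry, string $testUrl, int $timeout = 5): bool
{
    $ch = curl_init($testUrl);
    curl_setopt_array($ch, proxyCurlOptions($entry) + [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_CONNECTTIMEOUT => $timeout,
        CURLOPT_TIMEOUT        => $timeout,
    ]);
    $body = curl_exec($ch);

    return $body !== false && curl_getinfo($ch, CURLINFO_HTTP_CODE) === 200;
}
```

A list could then be filtered with something like array_filter($entries, function ($e) { return proxyWorks($e, 'http://www.example.com/'); }), where the test URL is again only a placeholder.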

Answer 2: (score: 0)

Example PHP cURL request through a Squid proxy:

$proxy = "1.1.1.1:12121"; // proxy address as host:port (placeholder)
$useragent = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1";
$url = "http://www.google.pt/search?q=anonymous";

$ch = curl_init();
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 15); // seconds to wait while connecting
curl_setopt($ch, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1); // must be the constant, not a quoted string
curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, 1); // tunnel through the HTTP proxy
curl_setopt($ch, CURLOPT_PROXY, $proxy);
curl_setopt($ch, CURLOPT_PROXYUSERPWD, 'USER:PASS'); // proxy credentials (placeholder)
curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the body instead of printing it
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // follow redirects
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0); // disables certificate verification
$result = curl_exec($ch);
curl_close($ch);

Learn how to set up your own Squid proxy with rotating outgoing IPs here
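The single-proxy request above can be extended to a list of proxies. The following is a minimal sketch, not a definitive implementation: it tries each proxy in turn and returns the first non-empty response. The function names, the placeholder addresses, and the idea of passing the transport in as a callable are all assumptions made for illustration.

```php
<?php
// Try each proxy in turn until one returns a non-empty response.
// $fetch is any callable taking (url, proxy) and returning the body or false,
// so the transport can be swapped out (for example, in tests).
function fetchViaProxyList(string $url, array $proxies, callable $fetch)
{
    foreach ($proxies as $proxy) {
        $body = $fetch($url, $proxy);
        if ($body !== false && $body !== '') {
            return ['proxy' => $proxy, 'body' => $body];
        }
    }

    return false; // every proxy in the list failed
}

// cURL-based transport: returns the response body, or false on failure.
function curlFetch(string $url, string $proxy)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_PROXY          => $proxy,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_CONNECTTIMEOUT => 15, // seconds
        CURLOPT_TIMEOUT        => 30, // seconds
    ]);

    return curl_exec($ch);
}

// Placeholder list; a real one might be loaded with
// file('proxies.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES).
$proxies = ['1.1.1.1:12121', '2.2.2.2:3128'];
shuffle($proxies); // spread requests across the list

// Uncomment to perform a real request through the list:
// $result = fetchViaProxyList('http://www.example.com/', $proxies, 'curlFetch');
```

Keeping the transport separate from the rotation loop means the loop can be exercised without any network access, and the cURL options can be changed without touching the failover logic.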

Answer 3: (score: 0)

If you don't mind using a paid API, gimmeproxy.com will do the job.

<?php
function getProxy() {
    $response = file_get_contents('http://gimmeproxy.com/api/getProxy?api_key=YOUR_API_KEY');
    $data = ($response === false) ? null : json_decode($response, true);
    if (!is_array($data)) { // the request failed or the response was not valid JSON
        return false;
    }
    if (isset($data['error'])) { // there are no proxies left for this user-id and timeout
        echo $data['error']."\n";
        return false;
    }
    // gimmeproxy returns a 'curl' field that is a CURLOPT_PROXY-ready string, see curl_setopt($curl, CURLOPT_PROXY, $proxy);
    return isset($data['curl']) ? $data['curl'] : false;
}

function get($url) {
    $curlOptions = array(
        CURLOPT_CONNECTTIMEOUT => 5, // connection timeout, seconds
        CURLOPT_TIMEOUT => 10, // total time allowed for the request, seconds
        CURLOPT_URL => $url,
        CURLOPT_SSL_VERIFYPEER => false, // don't verify ssl certificates, allows https scraping
        CURLOPT_SSL_VERIFYHOST => false, // don't verify ssl host, allows https scraping
        CURLOPT_FOLLOWLOCATION => true, // follow redirects
        CURLOPT_MAXREDIRS => 9, // max number of redirects
        CURLOPT_RETURNTRANSFER => TRUE,
        CURLOPT_HEADER => 0,
        CURLOPT_USERAGENT => "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36",
        CURLINFO_HEADER_OUT  => true,
    );
    $curl = curl_init();
    curl_setopt_array($curl, $curlOptions);
    if($proxy = getProxy()) {
        echo 'set proxy '.$proxy."\n";
        curl_setopt($curl, CURLOPT_PROXY, $proxy);
    }
    $data = curl_exec($curl);
    curl_close($curl);
    return $data;
}
while(true) {
    $data = get('https://news.ycombinator.com/');
    if(trim($data) && stripos($data, 'Hacker News') !== false) {
        echo "hacker news works fine";
        break;
    } else {
        echo "hacker news banned us, try another proxy\n";
    }
}