curl kills all requests when a single proxy fails

Time: 2018-07-25 07:27:33

Tags: php curl curl-multi

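Currently we are seeing our multi-curl setup go haywire when it cannot connect to a proxy. A single failing proxy is enough for curl to fail the whole batch.

The problem is that one timeout on a proxy server can bring down all connections. It happens with three errors: "Connection timed out", "Proxy CONNECT aborted due to timeout", and "SSL connection timeout".

curl then closes all connections and the batch fails.

Expected behavior: only the failed request is returned as failed; the remaining requests complete successfully and are returned.

Actual behavior: one failed request causes all connections to be closed, so the successful results are never returned. In the case shown below, only one result comes back as a success.

Creation of a single curl resource (Config is just an object holding configuration data):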

public function create(Config $config)
{
    $curlResource = curl_init($config->getUrl());

    curl_setopt($curlResource, CURLOPT_TIMEOUT, $config->getTimeout());
    curl_setopt($curlResource, CURLOPT_CONNECTTIMEOUT, 5);
    curl_setopt($curlResource, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($curlResource, CURLOPT_SSL_VERIFYPEER, $config->getSslCertificateValidation());
    curl_setopt($curlResource, CURLOPT_SSL_VERIFYHOST, $config->getSslCertificateValidation() ? 2 : 0);
    curl_setopt($curlResource, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($curlResource, CURLOPT_HEADER, true);
    curl_setopt($curlResource, CURLOPT_CUSTOMREQUEST, $config->getMethod());
    curl_setopt($curlResource, CURLOPT_VERBOSE, true);
    curl_setopt($curlResource, CURLOPT_USERAGENT, $config->getUserAgent());

    curl_setopt($curlResource, CURLOPT_MAXREDIRS, $config->getMaxRedirects());

    curl_setopt($curlResource, CURLOPT_HTTPHEADER, $this->processHeaders($config->getHeaders()));

    $proxyConfig = $config->getProxyConfig();

    curl_setopt($curlResource, CURLOPT_PROXY, $proxyConfig->getUrl());
    curl_setopt($curlResource, CURLOPT_PROXYPORT, $proxyConfig->getPort());

    curl_setopt($curlResource, CURLOPT_PROXYUSERPWD, $proxyConfig->getUsername().':'.$proxyConfig->getPassword());
    return $curlResource;
}

Multi handle setup / client:

$curlResources = []; //array of resources made by create(Config $config);  
$mh = curl_multi_init();

foreach ($curlResources as $curlResource) {
    curl_multi_add_handle($mh, $curlResource);
}

$active = null;
do {
    $mrc = curl_multi_exec($mh, $active);
} while ($mrc == CURLM_CALL_MULTI_PERFORM);

while ($active && $mrc == CURLM_OK) {
    // Wait for activity on any curl-connection
    if (curl_multi_select($mh) == -1) {
        usleep(1);
    }

    // Continue to exec until curl is ready to
    // give us more data
    do {
        $mrc = curl_multi_exec($mh, $active);
    } while ($mrc == CURLM_CALL_MULTI_PERFORM);
}

foreach ($curlResources as $index => $curlResource) {

    $errorMessage = curl_error($curlResource);

    if ($errorMessage) {
        //process error
    } else {
        //process success
    }

    curl_multi_remove_handle($mh, $curlResource);

    curl_close($curlResource);
}

curl_multi_close($mh);
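
The loop above decides success or failure only from `curl_error()` after the whole batch has finished. A minimal sketch of an alternative follows, assuming the goal is one result per handle: it drains `curl_multi_info_read()`, which reports the `CURLE_*` result code of each easy handle as it completes. The function name `runBatch`, its return shape, and the `$selectTimeout` parameter are hypothetical and not part of the original code. This only changes how results are collected; it does not by itself explain why the other transfers time out.

```php
<?php
// Hypothetical helper (not from the original code): runs all easy handles
// and returns one entry per handle, keyed like the input array.
function runBatch(array $curlResources, float $selectTimeout = 1.0): array
{
    $mh = curl_multi_init();
    foreach ($curlResources as $curlResource) {
        curl_multi_add_handle($mh, $curlResource);
    }

    $results = [];
    do {
        $mrc = curl_multi_exec($mh, $active);

        // Each completion message belongs to exactly one easy handle and
        // carries that handle's own CURLE_* result code.
        while (($info = curl_multi_info_read($mh)) !== false) {
            $handle = $info['handle'];
            $index = array_search($handle, $curlResources, true);
            $results[$index] = [
                'code'  => $info['result'],
                'error' => curl_error($handle),
                // The body is only available with CURLOPT_RETURNTRANSFER set.
                'body'  => $info['result'] === CURLE_OK
                    ? curl_multi_getcontent($handle)
                    : null,
            ];
            curl_multi_remove_handle($mh, $handle);
        }

        // Wait for activity instead of spinning; back off briefly on -1.
        if ($active && $mrc === CURLM_OK) {
            if (curl_multi_select($mh, $selectTimeout) === -1) {
                usleep(1000);
            }
        }
    } while ($active && ($mrc === CURLM_OK || $mrc === CURLM_CALL_MULTI_PERFORM));

    curl_multi_close($mh);

    return $results;
}
```

With this shape, a call such as `$results = runBatch($curlResources);` lets the caller treat every entry whose `code` equals `CURLE_OK` as a success and handle the rest as individual failures.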

Here is curl's verbose output:

Starting (PID: 23801)...
starting..
scrape limit test completed, continue..
getting proxy list..
proxy retrieved, continue to getting serp list..
*   Trying 11.148.119.10...
*   Trying 181.180.197.12...
*   Trying 12.168.17.164...
*   Trying 181.181.191.151...
*   Trying 181.134.18.121...
*   Trying 178.151.187.1...
*   Trying 185.159.12.141...
*   Trying 185.16.100.133...
*   Trying 185.18.12.11...
* Connected to 11.148.119.10 (11.148.119.10) port 5000 (#0)
* Establish HTTP proxy tunnel to www.google.cz:443
* Proxy auth using Basic with user 'user4058'
> CONNECT www.google.cz:443 HTTP/1.1
Host: www.google.cz:443
Proxy-Authorization: Basic dXNlcjQwNTg6RnZRT29KMU5rcg==
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit/601.2.7 (KHTML, like Gecko) Version/9.0.1 Safari/601.2.7
Proxy-Connection: Keep-Alive

< HTTP/1.1 200 Connection established
< 
* Proxy replied OK to CONNECT request
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* Connected to 181.180.197.12 (181.180.197.12) port 5000 (#1)
* Establish HTTP proxy tunnel to www.google.cz:443
* Proxy auth using Basic with user 'user1760'
> CONNECT www.google.cz:443 HTTP/1.1
Host: www.google.cz:443
Proxy-Authorization: Basic dXNlcjE3NjA6R3BMbHA4enJJaw==
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; Touch; rv:11.0) like Gecko
Proxy-Connection: Keep-Alive

< HTTP/1.1 200 Connection established
< 
* Proxy replied OK to CONNECT request
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* Connected to 12.168.17.164 (12.168.17.164) port 5000 (#2)
* Establish HTTP proxy tunnel to www.google.cz:443
* Proxy auth using Basic with user 'user3193'
> CONNECT www.google.cz:443 HTTP/1.1
Host: www.google.cz:443
Proxy-Authorization: Basic dXNlcjMxOTM6R3BMbHA4enJJaw==
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/601.7.1 (KHTML, like Gecko) Version/9.1.2 Safari/601.7.1
Proxy-Connection: Keep-Alive

< HTTP/1.1 200 Connection established
< 
* Proxy replied OK to CONNECT request
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* Connected to 181.181.191.151 (181.181.191.151) port 5000 (#3)
* Establish HTTP proxy tunnel to www.google.cz:443
* Proxy auth using Basic with user 'user3085'
> CONNECT www.google.cz:443 HTTP/1.1
Host: www.google.cz:443
Proxy-Authorization: Basic dXNlcjMwODU6R3BMbHA4enJJaw==
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/601.3.9 (KHTML, like Gecko) Version/9.0.2 Safari/601.3.9
Proxy-Connection: Keep-Alive

< HTTP/1.1 200 Connection established
< 
* Proxy replied OK to CONNECT request
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* Connected to 181.134.18.121 (181.134.18.121) port 3128 (#4)
* Establish HTTP proxy tunnel to www.google.cz:443
* Proxy auth using Basic with user 'clbaddr03810'
> CONNECT www.google.cz:443 HTTP/1.1
Host: www.google.cz:443
Proxy-Authorization: Basic Y2xiYWRkcjAzODEwOmNMaW0yMjUx
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2486.0 Safari/537.36 Edge/13.10586
Proxy-Connection: Keep-Alive

< HTTP/1.1 200 Connection established
< 
* Proxy replied OK to CONNECT request
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* Connected to 178.151.187.1 (178.151.187.1) port 3128 (#5)
* Establish HTTP proxy tunnel to www.google.cz:443
* Proxy auth using Basic with user 'clbaddr02305'
> CONNECT www.google.cz:443 HTTP/1.1
Host: www.google.cz:443
Proxy-Authorization: Basic Y2xiYWRkcjAyMzA1OmNMaW0yMjUx
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_2) AppleWebKit/600.3.18 (KHTML, like Gecko) Version/8.0.3 Safari/600.3.18
Proxy-Connection: Keep-Alive

< HTTP/1.1 200 Connection established
< 
* Proxy replied OK to CONNECT request
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* Connected to 185.16.100.133 (185.16.100.133) port 5000 (#7)
* Establish HTTP proxy tunnel to www.google.cz:443
* Proxy auth using Basic with user 'user1927'
> CONNECT www.google.cz:443 HTTP/1.1
Host: www.google.cz:443
Proxy-Authorization: Basic dXNlcjE5Mjc6aWhxSXBNaGpyeg==
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2486.0 Safari/537.36 Edge/13.10586
Proxy-Connection: Keep-Alive

< HTTP/1.1 200 Connection established
< 
* Proxy replied OK to CONNECT request
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* Connected to 185.18.12.11 (185.18.12.11) port 5000 (#8)
* Establish HTTP proxy tunnel to www.google.cz:443
* Proxy auth using Basic with user 'user1589'
> CONNECT www.google.cz:443 HTTP/1.1
Host: www.google.cz:443
Proxy-Authorization: Basic dXNlcjE1ODk6WVVjNXkyZ3hQVw==
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:38.0) Gecko/20100101 Firefox/38.0
Proxy-Connection: Keep-Alive

* Proxy CONNECT aborted due to timeout
* Operation timed out after 0 milliseconds with 0 out of 0 bytes received
* Closing connection 0
* Operation timed out after 0 milliseconds with 0 out of 0 bytes received
* Closing connection 1
* Operation timed out after 0 milliseconds with 0 out of 0 bytes received
* Closing connection 2
* Operation timed out after 0 milliseconds with 0 out of 0 bytes received
* Closing connection 3
* Operation timed out after 0 milliseconds with 0 out of 0 bytes received
* Closing connection 4
* Operation timed out after 0 milliseconds with 0 out of 0 bytes received
* Closing connection 5
* Connection timed out after 5000 milliseconds
* Closing connection 6
* Operation timed out after 0 milliseconds with 0 out of 0 bytes received
* Closing connection 7
* Empty reply from server
* Connection #8 to host 185.18.12.11 left intact

Results after processing:

serp list retrieved, processing errors
SERP LIST error:Operation timed out after 0 milliseconds with 0 out of 0 bytes received on 11.148.119.10
SERP LIST error:Operation timed out after 0 milliseconds with 0 out of 0 bytes received on 181.180.197.12
SERP LIST error:Operation timed out after 0 milliseconds with 0 out of 0 bytes received on 12.168.17.164
SERP LIST error:Operation timed out after 0 milliseconds with 0 out of 0 bytes received on 181.181.191.151
SERP LIST error:Operation timed out after 0 milliseconds with 0 out of 0 bytes received on 181.134.18.121
SERP LIST error:Operation timed out after 0 milliseconds with 0 out of 0 bytes received on 178.151.187.1
SERP LIST error:Connection timed out after 5000 milliseconds on 185.159.12.141
SERP LIST error:Operation timed out after 0 milliseconds with 0 out of 0 bytes received on 185.16.100.133
SERP LIST error:Proxy CONNECT aborted due to timeout on 185.18.12.11
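
In the output above, one proxy hits the 5000 ms connect timeout while every other transfer reports an operation timeout at the same moment. One thing that may be worth ruling out (an assumption, not a confirmed cause) is that `$config->getTimeout()` is no larger than the hard-coded 5 second `CURLOPT_CONNECTTIMEOUT`. `CURLOPT_TIMEOUT` covers the whole transfer including the connect phase, so in that case every handle would expire together. The helper name `checkTimeouts` below is hypothetical.

```php
<?php
// Hypothetical sanity check (not from the original code): returns a warning
// string when the total timeout leaves no room after the connect phase,
// or null when the combination looks reasonable.
function checkTimeouts(int $totalTimeout, int $connectTimeout): ?string
{
    // For CURLOPT_TIMEOUT, 0 means "no overall limit".
    if ($totalTimeout === 0) {
        return null;
    }

    if ($totalTimeout <= $connectTimeout) {
        return sprintf(
            'CURLOPT_TIMEOUT (%ds) does not exceed CURLOPT_CONNECTTIMEOUT (%ds)',
            $totalTimeout,
            $connectTimeout
        );
    }

    return null;
}
```

Calling `checkTimeouts($config->getTimeout(), 5)` before building the batch would flag such a configuration early.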

Is there a way to configure multi-curl so that it does not fail in this situation?

0 answers:

No answers