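I'm building a Google scraper in Adobe AIR using the Flex 4 framework, and I've hit a brick wall: Google forces a CAPTCHA after about 10 pages have been read.
Can anyone tell me how to fetch pages through a proxy server?
I'm using HTTPService.
Here is my code: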
service=new HTTPService();
service.addEventListener(ResultEvent.RESULT, googleResult);
service.addEventListener(FaultEvent.FAULT, googleFault);
service.resultFormat="text";
service.url=_googleURL+keyPhrase.text;
service.send();
Cheers,
Answer 0 (score: 1)
Solution:
I created a ProxyHTTPService class that extends HTTPService:
package com.pageone.proxyserv {
    import mx.rpc.AsyncToken;
    import mx.rpc.http.mxml.HTTPService;
    import mx.utils.URLUtil;

    public class ProxyHTTPService extends HTTPService {
        private var _finalURL:String;   // the page that should actually be fetched
        private var _tempURL:String;
        private var _proxy:Object;      // expected shape: {ip: ..., port: ...}
        private var phpProxyURL:String = "http://myserver/proxy.php";

        public function ProxyHTTPService(rootURL:String = "") {
            super();
        }

        public function get proxy():Object {
            return _proxy;
        }

        public function set proxy(value:Object):void {
            _proxy = value;
        }

        public function get finalURL():String {
            return _finalURL;
        }

        public function set finalURL(value:String):void {
            _finalURL = value;
        }

        override public function send(parameters:Object = null):AsyncToken {
            // Every request goes to the PHP proxy page; the real target travels as a parameter.
            this.url = phpProxyURL;
            var proxyargs:Object = new Object();
            proxyargs.proxy = _proxy.ip + ":" + _proxy.port;
            // Append the caller's parameters to the target URL as a query string.
            _tempURL = _finalURL;
            var params:String = URLUtil.objectToString(parameters, "&");
            if (_finalURL.indexOf("?") > 0) {
                _tempURL += "&" + params;
            } else {
                _tempURL += "?" + params;
            }
            _tempURL = encodeURI(_tempURL);
            // Restore any ":" and "/" that ended up double-encoded.
            _tempURL = replaceAll(_tempURL, "%253A", ":");
            _tempURL = replaceAll(_tempURL, "%252F", "/");
            proxyargs.url = _tempURL;
            return super.send(proxyargs);
        }

        private function replaceAll(string:String, find:String, replace:String):String {
            return string.split(find).join(replace);
        }
    }
}
Then I created a PHP page on the server:
<?php
// Both query parameters are required.
$url = $_GET["url"] or die("require url parameter");
$proxyuri = $_GET["proxy"] or die("require proxy parameter");

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, 0);
curl_setopt($ch, CURLOPT_PROXY, $proxyuri);    // "ip:port" of the proxy to route through
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13 GTB7.1');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 0);   // print the response straight to the output
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 2);
curl_setopt($ch, CURLOPT_TIMEOUT, 60);
curl_setopt($ch, CURLOPT_HEADER, 0);
$exec = curl_exec($ch);
curl_close($ch);
?>
Now, in ActionScript, you can call ProxyHTTPService like this:
var p:ProxyHTTPService=new ProxyHTTPService();
p.addEventListener(ResultEvent.RESULT, resultListener);
p.addEventListener(FaultEvent.FAULT, faultListener);
p.finalURL="http://www.google.com/search";
p.proxy={ip: "xxx.xxx.xxx.xxx", port:8080};
p.send({q: "StackOverflow"});
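Since the original problem was the CAPTCHA that appears after about 10 pages, one possible way to use the class is to spread requests over several proxies. The following is only a rough sketch under assumptions: RotatingProxyFetcher, its proxies list, and the search() method are hypothetical names that do not appear in the answer above, and the proxy addresses are placeholders. It sets resultFormat to "text", as the question's code does, so the HTML comes back as a plain string.

```actionscript
package com.pageone.proxyserv {
    import mx.rpc.events.FaultEvent;
    import mx.rpc.events.ResultEvent;

    // Hypothetical helper (not part of the original answer): sends each
    // search through the next proxy in a fixed list, round-robin.
    public class RotatingProxyFetcher {
        // Placeholder entries; replace with proxies you are allowed to use.
        private var proxies:Array = [
            {ip: "xxx.xxx.xxx.xxx", port: 8080},
            {ip: "yyy.yyy.yyy.yyy", port: 8080}
        ];
        private var nextIndex:int = 0;

        // onResult receives a ResultEvent, onFault receives a FaultEvent.
        public function search(query:String, onResult:Function, onFault:Function):void {
            var service:ProxyHTTPService = new ProxyHTTPService();
            service.resultFormat = "text";
            service.addEventListener(ResultEvent.RESULT, onResult);
            service.addEventListener(FaultEvent.FAULT, onFault);
            service.finalURL = "http://www.google.com/search";

            // Pick the next proxy and advance, wrapping at the end of the list.
            service.proxy = proxies[nextIndex];
            nextIndex = (nextIndex + 1) % proxies.length;

            service.send({q: query});
        }
    }
}
```

Calling search("StackOverflow", resultListener, faultListener) repeatedly would then cycle through the list. Whether this actually avoids the CAPTCHA depends on the proxies and on how quickly the requests are sent.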