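I want to use the HttpClient class to fetch the Google hit counts for many terms in succession, but the Google server does not let me do this repeatedly. Can you help me? Here is my program; the parameter Concept is the term I want to search for.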
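// The class names below match Apache Commons HttpClient 3.x, which must be on the classpath.
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URLEncoder;

import org.apache.commons.httpclient.DefaultHttpMethodRetryHandler;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpException;
import org.apache.commons.httpclient.HttpStatus;
import org.apache.commons.httpclient.methods.GetMethod;
import org.apache.commons.httpclient.params.HttpMethodParams;

public static double extractGoogleCount(String Concept)
{
    double temp = 0;
    HttpClient httpClient = new HttpClient();
    GetMethod getMethod = null;
    try
    {
        // URL-encode the term so spaces and special characters do not break the query.
        String url = "http://www.google.com/search?hl=en&newwindow=1&q="
                + URLEncoder.encode(Concept, "UTF-8")
                + "&aq=f&aqi=&aql=&oq=&gs_rfai=";
        getMethod = new GetMethod(url);
        getMethod.getParams().setParameter(HttpMethodParams.RETRY_HANDLER,
                new DefaultHttpMethodRetryHandler());
        int statusCode = httpClient.executeMethod(getMethod);
        if (statusCode != HttpStatus.SC_OK)
        {
            // A non-200 reply (for example when Google rejects the request)
            // has no result count to parse, so stop here.
            System.err.println("Method failed: "
                    + getMethod.getStatusLine() + " " + url);
            return temp;
        }
        // BufferedReader replaces the deprecated DataInputStream.readLine();
        // the page is assumed to be UTF-8.
        BufferedReader reader = new BufferedReader(new InputStreamReader(
                getMethod.getResponseBodyAsStream(), "UTF-8"));
        String returnPage;
        while ((returnPage = reader.readLine()) != null)
        {
            int index = returnPage.indexOf("<div id=\"resultStats\">");
            if (index == -1)
            {
                continue;
            }
            // Clamp the end index so a short line cannot overrun the string.
            String sub = returnPage.substring(index,
                    Math.min(index + 100, returnPage.length()));
            String[] result = sub.split(" ");
            String number;
            if (sub.indexOf("About") >= 0)
            {
                // "...>About 1,230 results": the count is the third token.
                number = result[2];
            } else
            {
                // "...>1,230 results": the count follows the closing '>'.
                number = result[1].substring(result[1].indexOf(">") + 1);
                System.out.println("number:" + number);
            }
            // Strip thousands separators in both cases before parsing.
            temp = Double.parseDouble(number.replaceAll(",", ""));
            break;
        }
    } catch (HttpException e)
    {
        System.out.println("Please check your provided http address!");
        e.printStackTrace();
    } catch (IOException e)
    {
        e.printStackTrace();
    } catch (Exception e)
    {
        // For example, a count that could not be parsed as a number.
        e.printStackTrace();
    } finally
    {
        httpClient.getState().clear();
        if (getMethod != null)
        {
            getMethod.releaseConnection();
        }
    }
    // Reached from the normal path and from every catch block,
    // so the method always returns a value.
    return temp;
}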
Answer 0 (score: 0)
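Google only allows a limited number of requests per second from a single client. Try adding a pause between consecutive requests:
Thread.sleep(200);
and it should work. Note that Thread.sleep throws the checked InterruptedException, so the calling code must catch or declare it. You may also want to move the lookups onto a separate thread, so that your program can keep doing other things (for example, displaying the data) while the requests run.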
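Both suggestions (pausing between requests and doing the work off the main thread) can be combined. The following is only a minimal sketch under stated assumptions, not a definitive implementation: the class name ThrottledLookup, the method submitAll, and the parameter delayMillis are hypothetical names introduced here, and the lookup function is passed in so the sketch does not depend on any particular HTTP library. The 200 ms figure is just the value suggested above, not a documented Google limit, so it may need tuning.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/**
 * Hypothetical helper: runs lookups one at a time on a single background
 * thread and sleeps before each one, so requests are spaced out.
 */
class ThrottledLookup {
    private final long delayMillis;                 // pause before each request
    private final Function<String, Double> lookup;  // e.g. a wrapper around extractGoogleCount
    // One worker thread guarantees that lookups never overlap.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    ThrottledLookup(long delayMillis, Function<String, Double> lookup) {
        this.delayMillis = delayMillis;
        this.lookup = lookup;
    }

    /** Queues every term and returns at once with one Future per term, in order. */
    List<Future<Double>> submitAll(List<String> terms) {
        List<Future<Double>> results = new ArrayList<>();
        for (String term : terms) {
            results.add(worker.submit(() -> {
                // Pause before every request, including the first one.
                Thread.sleep(delayMillis);
                Double count = lookup.apply(term);
                return count;
            }));
        }
        return results;
    }

    /** Lets already queued lookups finish, then stops the worker thread. */
    void shutdown() {
        worker.shutdown();
    }
}
```

Assuming extractGoogleCount lives in a class called MyClass (a placeholder name), it could be wired in as new ThrottledLookup(200, MyClass::extractGoogleCount). The caller can keep working after submitAll returns and call get() on each Future only when the count is actually needed.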