I want to log in to a website (Yahoo Mail - https://login.yahoo.com/config/login?.src=fpctx&.intl=us&.done=http%3A%2F%2Fwww.yahoo.com%2F)
using HttpClient and, once logged in, retrieve the page content (Java). What is wrong with my code?
import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.NameValuePair;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.cookie.Cookie;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.message.BasicNameValuePair;
import org.apache.http.protocol.HTTP;

public class TestHttpClient {
    public static void main(String[] args) throws Exception {
        DefaultHttpClient httpclient = new DefaultHttpClient();

        // Initial GET to pick up any cookies set before login
        HttpGet httpget = new HttpGet("http://www.yahoo.com/");
        HttpResponse response = httpclient.execute(httpget);
        HttpEntity entity = response.getEntity();
        System.out.println("Login form get: " + response.getStatusLine());
        if (entity != null) {
            // Release the connection so the client can be reused
            entity.consumeContent();
        }

        System.out.println("Initial set of cookies:");
        List<Cookie> cookies = httpclient.getCookieStore().getCookies();
        if (cookies.isEmpty()) {
            System.out.println("None");
        } else {
            for (int i = 0; i < cookies.size(); i++) {
                System.out.println("- " + cookies.get(i).toString());
            }
        }

        // POST the credentials to the login endpoint
        HttpPost httpost = new HttpPost("https://login.yahoo.com/config/login_verify2?.intl=us&.src=ym");
        List<NameValuePair> nvps = new ArrayList<NameValuePair>();
        nvps.add(new BasicNameValuePair("IDToken1", "Yahoo! ID"));
        nvps.add(new BasicNameValuePair("IDToken2", "Password"));
        httpost.setEntity(new UrlEncodedFormEntity(nvps, HTTP.UTF_8));
        response = httpclient.execute(httpost);
        System.out.println("Response " + response.toString());
        entity = response.getEntity();
        System.out.println("Login form get: " + response.getStatusLine());
        if (entity != null) {
            // Print the response body line by line
            InputStream is = entity.getContent();
            BufferedReader br = new BufferedReader(new InputStreamReader(is));
            String str = "";
            while ((str = br.readLine()) != null) {
                System.out.println("" + str);
            }
        }

        System.out.println("Post logon cookies:");
        cookies = httpclient.getCookieStore().getCookies();
        if (cookies.isEmpty()) {
            System.out.println("None");
        } else {
            for (int i = 0; i < cookies.size(); i++) {
                System.out.println("- " + cookies.get(i).toString());
            }
        }

        // Release all resources held by the client
        httpclient.getConnectionManager().shutdown();
    }
}
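When I print the output from the HttpEntity, it prints the content of the login page. How do I get the page content after logging in with HttpClient?
Answer 0 (score: 2)
If you look at the source of the Yahoo login page, you will see many other parameters that your request is not sending.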
<input type="hidden" name=".tries" value="1">
<input type="hidden" name=".src" value="fpctx">
<input type="hidden" name=".md5" value="">
<input type="hidden" name=".hash" value="">
<input type="hidden" name=".js" value="">
<input type="hidden" name=".last" value="">
<input type="hidden" name="promo" value="">
<input type="hidden" name=".intl" value="us">
<input type="hidden" name=".bypass" value="">
<input type="hidden" name=".partner" value="">
<input type="hidden" name=".u" value="a0bljsd77uima">
<input type="hidden" name=".v" value="0">
<input type="hidden" name=".challenge" value="sCm6Z8Bv1vy78LBlEd8dnFsmbit1">
<input type="hidden" name=".yplus" value="">
...
I suspect this is why Yahoo treats the login as failed and sends you back to the login page. That login page is the response you are seeing.
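As a rough illustration of what sending those extra parameters could involve, the sketch below pulls the hidden fields out of a login page's HTML and builds a URL-encoded form body from them. It uses only the JDK so that it runs on its own; the class name `HiddenFormFields`, its two helper methods, and the credential field names `login` and `passwd` are assumptions made for this example, not something confirmed by Yahoo's form. The regular expression only handles `<input>` tags whose attributes appear in the order shown above, so a real page would be better served by an HTML parser.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HiddenFormFields {
    // Matches <input type="hidden" name="..." value="..."> with attributes in exactly this order
    private static final Pattern HIDDEN_INPUT = Pattern.compile(
            "<input\\s+type=\"hidden\"\\s+name=\"([^\"]*)\"\\s+value=\"([^\"]*)\"\\s*/?>",
            Pattern.CASE_INSENSITIVE);

    // Collects hidden field names and values in document order
    static Map<String, String> extract(String html) {
        Map<String, String> fields = new LinkedHashMap<String, String>();
        Matcher m = HIDDEN_INPUT.matcher(html);
        while (m.find()) {
            fields.put(m.group(1), m.group(2));
        }
        return fields;
    }

    // Builds an application/x-www-form-urlencoded body from the fields
    static String toFormBody(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder();
        try {
            for (Map.Entry<String, String> e : fields.entrySet()) {
                if (sb.length() > 0) {
                    sb.append('&');
                }
                sb.append(URLEncoder.encode(e.getKey(), "UTF-8"))
                  .append('=')
                  .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available on a conforming JVM
            throw new IllegalStateException(ex);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // In practice this HTML would come from the GET response for the login page
        String html =
                "<input type=\"hidden\" name=\".tries\" value=\"1\">\n"
              + "<input type=\"hidden\" name=\".src\" value=\"fpctx\">\n"
              + "<input type=\"hidden\" name=\".challenge\" value=\"sCm6Z8Bv1vy78LBlEd8dnFsmbit1\">";
        Map<String, String> form = extract(html);
        // Hypothetical credential field names; read the real ones from the form
        form.put("login", "your-yahoo-id");
        form.put("passwd", "your-password");
        System.out.println(toFormBody(form));
    }
}
```

With HttpClient, the same map could be turned into a list of `BasicNameValuePair` objects and passed to `UrlEncodedFormEntity`, as the question's code already does for its two fields. Values such as `.challenge` look like they are generated per request, so they would need to be read from a fresh copy of the login page each time rather than hard-coded.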
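Many websites try to prevent programmatic logins (to keep out bots or for other security reasons), so what you are attempting may be difficult. You could: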