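I am trying to log in to a website I want to download from, and to parse the HTML page right after logging in. To test this, I am using the following JUnit test: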
@Test
public void testLogin() throws IOException, URISyntaxException {
    CloseableHttpClient httpClient = null;
    CloseableHttpResponse loginResponse = null;
    CloseableHttpResponse getDataResponse = null;
    try {
        // One cookie store shared by the client and the request context.
        CookieStore cookieStore = new BasicCookieStore();
        RequestConfig globalConfig = RequestConfig.custom()
                .setCookieSpec(CookieSpecs.STANDARD)
                .build();
        HttpClientContext context = HttpClientContext.create();
        context.setCookieStore(cookieStore);
        httpClient = HttpClients.custom()
                .setDefaultRequestConfig(globalConfig)
                .setDefaultCookieStore(cookieStore)
                .setRedirectStrategy(new LaxRedirectStrategy())
                .build();

        // Form fields sent with the login POST.
        List<NameValuePair> formparams = new ArrayList<>();
        formparams.add(new BasicNameValuePair("user", "username"));
        formparams.add(new BasicNameValuePair("pass", "password"));
        formparams.add(new BasicNameValuePair("submit", "Login"));
        formparams.add(new BasicNameValuePair("logintype", "login"));
        formparams.add(new BasicNameValuePair("pid", "1"));
        formparams.add(new BasicNameValuePair("redirect_url", ""));
        formparams.add(new BasicNameValuePair("tx_felogin_pil[noredirect]", "0"));
        UrlEncodedFormEntity entity = new UrlEncodedFormEntity(formparams);
        HttpPost login = new HttpPost("https://localhost/");
        login.setEntity(entity);
        loginResponse = httpClient.execute(login, context);

        // Print the cookies received after the login request.
        List<Cookie> cookies = cookieStore.getCookies();
        if (cookies.isEmpty()) {
            System.out.println("None");
        } else {
            for (int i = 0; i < cookies.size(); i++) {
                System.out.println("- " + cookies.get(i).toString());
            }
        }
        loginResponse.getEntity().writeTo(System.out);

        // Request the page again with the same context (and cookies).
        HttpGet getData = new HttpGet("https://localhost/");
        getDataResponse = httpClient.execute(getData, context);
        getDataResponse.getEntity().writeTo(System.out);
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        if (loginResponse != null) {
            loginResponse.close();
        }
        if (getDataResponse != null) {
            getDataResponse.close();
        }
        if (httpClient != null) {
            httpClient.close();
        }
    }
}
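After the login request executes, the CookieStore contains the expected session cookie, but the response entity still contains the HTML of the login page. The problem persists after the getData request: the HttpResponse entity still holds the login page content.
The following curl command works fine: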
curl --data "user=username&pass=password" https://localhost/
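One difference I can see is that curl sends only two form fields, while my test sends seven. To make that visible, here is a minimal sketch using only the JDK (not Apache HttpClient) that prints the two request bodies side by side. The class name `CurlFormBody` and the helper `encode` are hypothetical names made up for this illustration. It also assumes standard `application/x-www-form-urlencoded` encoding. curl's `--data` sends the string as given without encoding it, which makes no difference for these two plain values.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class CurlFormBody {

    /** Encodes fields as application/x-www-form-urlencoded, in insertion order. */
    static String encode(Map<String, String> fields) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            body.add(URLEncoder.encode(field.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(field.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    /** The two fields passed to curl via --data. */
    static Map<String, String> curlFields() {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("user", "username");
        fields.put("pass", "password");
        return fields;
    }

    /** The seven fields the JUnit test puts into the UrlEncodedFormEntity. */
    static Map<String, String> testFields() {
        Map<String, String> fields = curlFields();
        fields.put("submit", "Login");
        fields.put("logintype", "login");
        fields.put("pid", "1");
        fields.put("redirect_url", "");
        fields.put("tx_felogin_pil[noredirect]", "0");
        return fields;
    }

    public static void main(String[] args) {
        // Body equivalent to: curl --data "user=username&pass=password"
        System.out.println(encode(curlFields()));
        // Body the JUnit test sends; square brackets are percent-encoded.
        System.out.println(encode(testFields()));
    }
}
```

This only shows what goes over the wire in each case. Whether the five extra fields are what makes the server return the login page again is something I have not confirmed.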
Do you know what is going wrong?