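I am using Selenium to log in to a website, and once logged in I perform a number of tasks. After those tasks I have to download a PDF document and verify its contents. Since Selenium does not support downloading documents, I decided to use Apache HttpClient 4.3.4.
I found this site, which covers pretty much everything I thought I needed for the GET request. After adapting that code to HttpClient 4.3.4 and to my website, I ended up with the following code: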
public void dowloadFile(String dowloadURL, String outputFileLocation, String outputFileName) throws Exception {
    SSLContextBuilder builder = new SSLContextBuilder();
    builder.loadTrustMaterial(null, new TrustStrategy() {
        public boolean isTrusted(X509Certificate[] chain, String authType)
                throws CertificateException {
            return true;
        }
    });
    SSLConnectionSocketFactory sslsf = new SSLConnectionSocketFactory(
            builder.build());
    CookieStore cookieStore = seleniumCookiesToCookieStore();
    CloseableHttpClient httpClient = HttpClients.custom()
            .setSSLSocketFactory(sslsf)
            .setDefaultCookieStore(cookieStore)
            .build();
    try {
        HttpGet httpget = new HttpGet(baseURL + dowloadURL);
        CloseableHttpResponse response = httpClient.execute(httpget);
        try {
            HttpEntity entity = response.getEntity();
            if (entity != null) {
                File directoriesFile = new File(outputFileLocation);
                directoriesFile.mkdirs();
                File outputFile = new File(outputFileLocation + "\\" + outputFileName);
                InputStream inputStream = entity.getContent();
                FileOutputStream fileOutputStream = new FileOutputStream(outputFile);
                int read = 0;
                byte[] bytes = new byte[1024];
                while ((read = inputStream.read(bytes)) != -1) {
                    fileOutputStream.write(bytes, 0, read);
                }
                fileOutputStream.close();
                log.info("Downloaded " + outputFile.length() + " bytes. " + entity.getContentType());
            }
            else {
                log.warn("Download failed!");
            }
        } finally {
            response.close();
        }
    } finally {
        httpClient.close();
    }
}
private CookieStore seleniumCookiesToCookieStore() {
    Set<Cookie> seleniumCookies = this.driver.manage().getCookies();
    CookieStore cookieStore = new BasicCookieStore();
    for (Cookie seleniumCookie : seleniumCookies) {
        BasicClientCookie basicClientCookie =
                new BasicClientCookie(seleniumCookie.getName(), seleniumCookie.getValue());
        basicClientCookie.setDomain(seleniumCookie.getDomain());
        basicClientCookie.setExpiryDate(seleniumCookie.getExpiry());
        basicClientCookie.setPath(seleniumCookie.getPath());
        cookieStore.addCookie(basicClientCookie);
    }
    return cookieStore;
}
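After running this code, the resulting file contains the HTML of the website's login page. Looking at the cookies: on the login screen there is a cookie named PD-S-SESSION-ID; once I log in, the PD-S-SESSION-ID cookie is still there but with a different value, and there is a new cookie named JSESSIONID. What I would like to know is whether I have overlooked something, or whether something is wrong with the code itself.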
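To make the failure easier to spot, here is a minimal sketch of a check that could run on the downloaded file before any content verification. It relies only on the fact that a well-formed PDF starts with the ASCII bytes `%PDF-`, whereas the login page I am getting back starts with HTML. The class name `PdfSignatureCheck` and the method `looksLikePdf` are hypothetical names made up for this illustration; they are not part of HttpClient or Selenium, and this does not fix the cookie problem itself.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper (not from any library): detects whether a download
// is a PDF or something else, such as an HTML login page.
final class PdfSignatureCheck {

    // A well-formed PDF file begins with the ASCII bytes "%PDF-".
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    // Returns true if the given leading bytes start with the PDF signature.
    static boolean looksLikePdf(byte[] head) {
        if (head.length < PDF_MAGIC.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(head, PDF_MAGIC.length), PDF_MAGIC);
    }

    // Reads only the first few bytes of the file and checks the signature.
    static boolean looksLikePdf(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] head = new byte[PDF_MAGIC.length];
            int filled = 0;
            while (filled < head.length) {
                int n = in.read(head, filled, head.length - filled);
                if (n == -1) {
                    break; // file is shorter than the signature
                }
                filled += n;
            }
            return looksLikePdf(Arrays.copyOf(head, filled));
        }
    }
}
```

Calling `PdfSignatureCheck.looksLikePdf(outputFile.toPath())` right after the download would let the method log a warning when the server returned the login page instead of the document.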