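I am using version 4.2.5 of AutoRetryHttpClient from org.apache.httpcomponents to download a PDF file from a URL whose scheme is https. The code is written in NetBeans 7.3 and uses JDK 7.
Assuming the fictional PDF resource is located at https://www.thedomain.with/my_resource.pdf, I have the following code: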
SchemeRegistry registry = new SchemeRegistry();
try {
    final SSLSocketFactory sf = new SSLSocketFactory(new TrustStrategy() {
        @Override
        public boolean isTrusted(X509Certificate[] chain, String authType)
                throws CertificateException {
            return true;
        }
    });
    registry.register(new Scheme("https", 3920, sf));
} catch (NoSuchAlgorithmException | KeyManagementException | KeyStoreException | UnrecoverableKeyException ex) {
    Logger.getLogger(HttpConnection.class.getName()).log(Level.SEVERE, null, ex);
}
// Here I create the client.
HttpClient client = new AutoRetryHttpClient(new DefaultHttpClient(new PoolingClientConnectionManager(registry)),
        new DefaultServiceUnavailableRetryStrategy(5, // max number of retries
                100)); // retry interval
HttpResponse httpResponse = null;
try {
    HttpGet httpget = new HttpGet("https://www.thedomain.with/my_resource.pdf");
    // I set header and Mozilla User-Agent
    httpResponse = client.execute(httpget);
} catch (IOException ex) {
}
... // other lines of code to get and save the file, not really important since the code is never reached
When I call client.execute, the following exception is thrown:
org.apache.http.conn.HttpHostConnectException: Connection to https://www.thedomain.with refused
How can I get this PDF resource?
PS: I can download it through a browser, so there is a way to get the file.
Answer 0 (score: 0)
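There seem to be a couple of issues:
First, the scheme is registered with 3920 as its default port, which is not the standard HTTPS port. If the server really listened on that port, the URL in the browser would have to be https://www.thedomain.with:3920/my_resource.pdf. Since the URL you use in the browser does not include port 3920, the server is presumably listening on the default port 443, so you should change new Scheme("https", 3920, sf) to new Scheme("https", 443, sf).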
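As a small illustration of the port reasoning, the JDK-only sketch below shows which port a client targets for the two forms of the URL. The class and method names (EffectivePort, effectivePort) are hypothetical and not part of HttpClient; the sketch only parses the URLs and opens no connection.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

final class EffectivePort {
    // Returns the explicit port if the URL carries one, otherwise the
    // default port of its protocol (443 for https).
    static int effectivePort(String spec) {
        try {
            URL url = URI.create(spec).toURL();
            return url.getPort() != -1 ? url.getPort() : url.getDefaultPort();
        } catch (MalformedURLException ex) {
            throw new IllegalArgumentException("Not a valid URL: " + spec, ex);
        }
    }

    public static void main(String[] args) {
        // URL as typed in the browser: no explicit port, so 443 is used.
        System.out.println(effectivePort("https://www.thedomain.with/my_resource.pdf"));      // prints 443
        // URL that would be needed if the server really listened on 3920.
        System.out.println(effectivePort("https://www.thedomain.with:3920/my_resource.pdf")); // prints 3920
    }
}
```

With the scheme registered on 3920, a request URL without an explicit port is sent to port 3920, where apparently nothing is listening, which would explain the "Connection refused" message.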
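Second, once the connection succeeds, hostname verification can fail with an SSLPeerUnverifiedException, presumably because the server's certificate does not match its hostname. To make that work, you need to use the SSLSocketFactory(TrustStrategy, HostnameVerifier) constructor and pass a verifier that does not perform this check. Apache provides AllowAllHostnameVerifier for this purpose.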
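To make concrete what these two switches turn off, here is a rough JDK-only counterpart, written with javax.net.ssl instead of the HttpClient classes so that it is self-contained. It is a sketch for test environments only, not the HttpClient API itself. The class and method names (InsecureTlsForTesting, trustAllContext, allowAllHostnames) are hypothetical.

```java
import java.security.GeneralSecurityException;
import java.security.cert.X509Certificate;
import javax.net.ssl.HostnameVerifier;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSession;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

final class InsecureTlsForTesting {
    // Counterpart of a TrustStrategy that always returns true:
    // every certificate chain is accepted without any validation.
    static SSLContext trustAllContext() {
        TrustManager[] trustAll = { new X509TrustManager() {
            @Override public void checkClientTrusted(X509Certificate[] chain, String authType) { }
            @Override public void checkServerTrusted(X509Certificate[] chain, String authType) { }
            @Override public X509Certificate[] getAcceptedIssuers() { return new X509Certificate[0]; }
        } };
        try {
            SSLContext context = SSLContext.getInstance("TLS");
            context.init(null, trustAll, null);
            return context;
        } catch (GeneralSecurityException ex) {
            throw new IllegalStateException("Could not create SSL context", ex);
        }
    }

    // Counterpart of AllowAllHostnameVerifier: the certificate is never
    // compared with the hostname the client connected to.
    static HostnameVerifier allowAllHostnames() {
        return new HostnameVerifier() {
            @Override public boolean verify(String hostname, SSLSession session) {
                return true;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(trustAllContext().getProtocol());
        System.out.println(allowAllHostnames().verify("www.thedomain.with", null));
    }
}
```

With the JDK's own HttpsURLConnection, these would be applied through setSSLSocketFactory(trustAllContext().getSocketFactory()) and setHostnameVerifier(allowAllHostnames()). In HttpClient 4.2, the two-argument SSLSocketFactory constructor mentioned above plays the same role.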
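Note: you really should not use a no-op TrustStrategy and HostnameVerifier in production code, because this essentially turns off all the security checks that authenticate the remote server and leaves you open to impersonation attacks.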