Apache HttpClient and remote files at URLs with the https scheme

Date: 2013-09-10 15:20:25

Tags: java https apache-commons-httpclient

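I am using version 4.2.5 of AutoRetryHttpClient from org.apache.httpcomponents to download a PDF file from a URL whose scheme is https. The code is written in NetBeans 7.3 and uses JDK 7.

Assuming the fictional PDF resource is located at https://www.thedomain.with/my_resource.pdf, I have the following code: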

    SchemeRegistry registry = new SchemeRegistry();
    try {
        // Trust every certificate chain, whatever it contains.
        final SSLSocketFactory sf = new SSLSocketFactory(new TrustStrategy() {
            @Override
            public boolean isTrusted(X509Certificate[] chain, String authType)
                    throws CertificateException {
                return true;
            }
        });

        registry.register(new Scheme("https", 3920, sf));
    } catch (NoSuchAlgorithmException | KeyManagementException | KeyStoreException | UnrecoverableKeyException ex) {
        Logger.getLogger(HttpConnection.class.getName()).log(Level.SEVERE, null, ex);
    }
    // Here I create the client.
    HttpClient client = new AutoRetryHttpClient(
            new DefaultHttpClient(new PoolingClientConnectionManager(registry)),
            new DefaultServiceUnavailableRetryStrategy(
                    5,      // max number of retries
                    100));  // retry interval

    HttpResponse httpResponse = null;
    try {
        HttpGet httpget = new HttpGet("https://www.thedomain.with/my_resource.pdf");
        // I set the headers and a Mozilla User-Agent here.
        httpResponse = client.execute(httpget);
    } catch (IOException ex) {
    }
    // ... other lines of code to get and save the file; not really important since this code is never reached

When I call client.execute, the following exception is thrown:

org.apache.http.conn.HttpHostConnectException: Connection to https://www.thedomain.with refused

How can I get that PDF resource?

PS: I can download it through the browser, so there is a way to get the file.

1 answer:

Answer 0 (score: 0)

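There appear to be a couple of problems:

  • You registered the Scheme with 3920 as its default port, which is a non-standard port number for HTTPS. If the server were actually running on that port, you would have to access it in the browser with this URL: https://www.thedomain.with:3920/my_resource.pdf. Since the URL you use in the browser does not include port 3920, the server must be running on the default port 443, so you should change new Scheme("https", 3920, sf) to new Scheme("https", 443, sf).
  • The CN in the server's certificate does not appear to match its hostname, which leads to an SSLPeerUnverifiedException. To make it work, you need to use the SSLSocketFactory(TrustStrategy, X509HostnameVerifier) constructor and pass a verifier that does not perform this check. Apache provides AllowAllHostnameVerifier for this purpose.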
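
To illustrate the first point: when a URL carries no explicit port, the client falls back to the default port registered for the scheme. The sketch below mimics that lookup with the JDK's java.net.URI only. PortResolution and effectivePort are hypothetical names made up for this illustration, not part of HttpClient, and the real library may differ in details.

```java
import java.net.URI;

public class PortResolution {
    // Hypothetical helper: use the port written in the URI if there is one,
    // otherwise fall back to the default port registered for the scheme.
    public static int effectivePort(URI uri, int schemeDefaultPort) {
        int explicit = uri.getPort(); // -1 when the URI has no port component
        return explicit > 0 ? explicit : schemeDefaultPort;
    }

    public static void main(String[] args) {
        URI noPort = URI.create("https://www.thedomain.with/my_resource.pdf");
        URI withPort = URI.create("https://www.thedomain.with:3920/my_resource.pdf");

        // Default port as registered in the question: the request goes to 3920.
        System.out.println(effectivePort(noPort, 3920));  // prints 3920
        // Default port as suggested in this answer: the request goes to 443.
        System.out.println(effectivePort(noPort, 443));   // prints 443
        // An explicit port in the URL always wins over the registered default.
        System.out.println(effectivePort(withPort, 443)); // prints 3920
    }
}
```

With 3920 registered as the default, a URL without a port ends up targeting a port on which nothing is listening, which is consistent with the "Connection refused" error in the question.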

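Note: you really should not use a no-op TrustStrategy and HostnameVerifier in production code, because doing so effectively turns off all the security checks that authenticate the remote server and leaves you open to impersonation attacks.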
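
As a rough illustration of why a no-op verifier is risky, the sketch below uses the JDK's javax.net.ssl.HostnameVerifier interface, which Apache's X509HostnameVerifier extends. NoOpVerifierDemo, ALLOW_ALL and the host name attacker.example are made up for this example. The session argument is passed as null only because a no-op verifier never inspects it.

```java
import javax.net.ssl.HostnameVerifier;
import javax.net.ssl.SSLSession;

public class NoOpVerifierDemo {
    // A verifier that skips the hostname check entirely, in the spirit of an
    // allow-all verifier: it never looks at the certificate in the session.
    public static final HostnameVerifier ALLOW_ALL = new HostnameVerifier() {
        @Override
        public boolean verify(String hostname, SSLSession session) {
            return true;
        }
    };

    public static void main(String[] args) {
        // The intended host is accepted...
        System.out.println(ALLOW_ALL.verify("www.thedomain.with", null)); // prints true
        // ...and so is any other host, which is exactly the problem.
        System.out.println(ALLOW_ALL.verify("attacker.example", null));   // prints true
    }
}
```

Because every host name is accepted, a server presenting someone else's certificate would pass the check just as easily as the real one.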