Why is the response from Jsoup.connect(...).execute() null? When is IOException thrown?

Time: 2013-10-29 23:07:19

Tags: java parsing jsoup

My code is as follows:

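When I call this method with an invalid URL, for example http://en.dddddddddssss.org/, execute() throws an exception and response is null. Why? How can I get the HTTP status code in that case?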

public Document getDocumentFromUrl(String url) throws SiteBusinessException {
        Response response = null;
        try {
            response = Jsoup.connect(url).timeout(Constans.TIMEOUT).ignoreHttpErrors(false).userAgent(Constans.USER_AGENT)
                    .ignoreContentType(Constans.IGNORE_CONTENT_TYPE).execute();
            return response.parse();
        } catch (IOException ioe) {
            LOGGER.warn("Cannot fetch site ]");
            return null;
        }
    }

Edit

public Document getDocumentFromUrl(String url) throws SiteBusinessException {
        Response response = null;
        try {
            response = Jsoup.connect(url).timeout(Constans.TIMEOUT).ignoreHttpErrors(false)
                    .userAgent(Constans.USER_AGENT).ignoreContentType(Constans.IGNORE_CONTENT_TYPE).execute();
            return response.parse();
        } catch (HttpStatusException hse) {
            LOGGER.warn("Cannot fetch site [url={}, statusMessage={}, statusCode={}]",
                    new Object[] { url, response != null ? response.statusMessage() : "<null>",
                            response != null ? String.valueOf(response.statusCode()) : "<null>" });
            throw new SiteBusinessException(response != null ? response.statusMessage() : "<null>",
                    String.valueOf(response != null ? response.statusCode() : "<null>"));

        } catch (IOException ioe) {
            LOGGER.warn("IOException. Cannot fetch site [url={}, errorMessage={}]", url, ioe.getMessage());
            throw new SiteBusinessException("Not found");
        }
    }

Then I tried calling http://localhost:8090/wrongaddress/. JBoss returns HTTP 404.

But my code logs:

Cannot fetch site [url=http://localhost:8090/wrongaddress/, statusMessage=<null>, statusCode=<null>]

Edit

Working solution

try {
            response = Jsoup.connect(url).execute();
            return processDocument(response.parse(), url);
        } catch (IllegalArgumentException iae) {
            LOGGER.warn("Malformed URL [url={}, message={}]", new Object[] { url, iae.getMessage() });
            throw new SiteBusinessException(iae.getMessage());
        } catch (MalformedURLException mue) {
            LOGGER.warn("Malformed URL [url={}, message={}]", new Object[] { url, mue.getMessage() });
            throw new SiteBusinessException(mue.getMessage());
        } catch (HttpStatusException hse) {
            LOGGER.warn("Cannot fetch site [url={}, statusMessage={}, statusCode={}]",
                    new Object[] { url, hse.getMessage(), hse.getStatusCode() });
            throw new SiteBusinessException(hse.getMessage(), hse.getStatusCode());
        } catch (IOException ioe) {
            LOGGER.warn("IOException. Cannot fetch site [url={}, errorMessage={}]", url, ioe.getMessage());
            throw new SiteBusinessException("Cannot fetch site");
        }

1 answer:

Answer 0 (score: 0)

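No, it doesn't. You are catching the exception and returning null yourself. A method can never throw an exception and return a value at the same time.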
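The same effect explains why the second version of the question's code logs statusMessage=<null>: if the call on the right-hand side of an assignment throws, the assignment never happens, so the variable still holds its initial null inside the catch block. Here is a minimal plain-Java sketch of that behaviour. It does not use jsoup, and the class NullOnThrowDemo and the methods fetch and describe are hypothetical names made up for illustration.

```java
import java.io.IOException;

// Hypothetical demo class; it does not use jsoup.
class NullOnThrowDemo {

    // Stands in for a call such as execute() that fails before producing a result.
    static String fetch(String url) throws IOException {
        throw new IOException("simulated failure for " + url);
    }

    // Mirrors the structure of the question's method: the variable is assigned inside try.
    static String describe(String url) {
        String response = null;
        try {
            response = fetch(url); // throws, so the assignment never completes
            return "ok: " + response;
        } catch (IOException ioe) {
            // response is still null here; only the exception carries information
            return "response=" + response + ", error=" + ioe.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("http://example.invalid/"));
    }
}
```

This is why the working solution in the question reads the status from the caught HttpStatusException (hse.getStatusCode()) and not from response.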

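There is no HTTP code because the host does not exist. HTTP codes are returned by the server. For example, the best-known code is 404 (Not Found). When your browser shows a 404, it is just the HTTP response, carried over TCP, that the server sent to the client with this code in it.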
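To make the distinction concrete, here is a small sketch that uses only the JDK, on the assumption that a failed host lookup surfaces as java.net.UnknownHostException, which is a subclass of IOException. The class UnknownHostDemo, the method classify and the host name en.dddddddddssss.invalid are hypothetical and chosen for illustration; the actual result of the lookup in main depends on the local resolver.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical demo class; it uses only the JDK, not jsoup.
class UnknownHostDemo {

    // Describes what kind of information a given IOException can offer.
    static String classify(IOException e) {
        if (e instanceof UnknownHostException) {
            // No server was ever contacted, so no HTTP status code exists.
            return "no HTTP status: host could not be resolved";
        }
        return "I/O failure: " + e.getMessage();
    }

    public static void main(String[] args) {
        try {
            // ".invalid" is a reserved top-level domain that should never resolve.
            InetAddress.getByName("en.dddddddddssss.invalid");
            System.out.println("resolved (unexpected)");
        } catch (IOException ioe) {
            System.out.println(classify(ioe));
        }
    }
}
```

A status code such as 404 only exists once a server has answered. In jsoup that case is reported through HttpStatusException, which is also an IOException, so it has to be caught before the general IOException handler, as the working solution in the question does.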