Question

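I am trying to download the content of a page using JSoup. If the whole operation (opening the connection + reading) takes more than 8 seconds, I want to abort immediately. I assumed that this is exactly the purpose of the timeout(int millis) method. According to the Javadoc:

Set the request timeouts (connect and read). If a timeout occurs, an IOException will be thrown. The default timeout is 3 seconds (3000 millis). A timeout of zero is treated as an infinite timeout.

I wrote a simple piece of code that simulates the operation: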

    final int TIME_OUT = 8000;
    final String USER_AGENT_STRING = "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.2; WOW64; Trident/6.0)";
    final String url = "http://reguler-pmb-tanggamus.va.web.id/";

    long time = System.currentTimeMillis();
    try {
        Document doc = Jsoup.connect(url).userAgent(USER_AGENT_STRING).timeout(TIME_OUT).get();
        System.out.println("Done crawling " + url + ", took " + (System.currentTimeMillis() - time) + " millis");
        System.out.println("Content: " + doc);
    } catch (Exception e) {
        System.out.println("Failed after " + (System.currentTimeMillis() - time) + " millis");
        e.printStackTrace();
    }

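I ran this small program against a "problematic" website in a single-threaded environment. I expected the operation to take no more than 8 seconds (8000 ms), whether it succeeds or an exception is caught. Unfortunately, that is not the case: sometimes it succeeds (with no exception) after more than a minute: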

Done crawling http://reguler-pmb-tanggamus.va.web.id/, took 68215 millis
Content: <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> ...

and sometimes (very rarely) it fails with a SocketTimeoutException, again after more than a minute.

Has anyone run into this kind of problem before?

Answer 1

The problem the OP is facing appears to be a bug in Jsoup 1.8.3.

I was able to reproduce your findings. I suggest you file a bug report at github.com/jhy/jsoup/issues (luksch)

The OP has filed an issue: https://github.com/jhy/jsoup/issues/628

Answer 2

The JSoup team (jhy) answered my question:

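That sets the connect timeout and the read timeout. The read timeout means the time between reads. If you have a server that trickles content over a long time, but each read is < 8 seconds, it won't time out.

It might be nice to implement a max timer, but it's not straightforward (it needs a monitor thread and a practical way to close a connection), and it's not a requirement that many others have had.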
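To illustrate the reply above, here is a minimal sketch that uses only the JDK and no Jsoup code. A local server sends one byte every 200 ms while the client uses a 500 ms read timeout. The class name ReadTimeoutDemo and the method readSlowStream are made up for this illustration. The assumption is that Jsoup's timeout behaves like a plain socket read timeout, as jhy describes.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class ReadTimeoutDemo {

    /**
     * Starts a local server that sends `chunks` single bytes, pausing
     * `delayMillis` before each one, and reads them with the given read
     * timeout. Returns the total elapsed time in milliseconds.
     */
    public static long readSlowStream(int chunks, int delayMillis, int readTimeoutMillis)
            throws Exception {
        try (ServerSocket server = new ServerSocket(0)) { // port 0 = any free port
            Thread sender = new Thread(() -> {
                try (Socket s = server.accept(); OutputStream out = s.getOutputStream()) {
                    for (int i = 0; i < chunks; i++) {
                        Thread.sleep(delayMillis); // trickle the content
                        out.write('x');
                        out.flush();
                    }
                } catch (Exception ignored) {
                    // the client went away or the demo was shut down
                }
            });
            sender.setDaemon(true);
            sender.start();

            long start = System.currentTimeMillis();
            try (Socket client = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort())) {
                client.setSoTimeout(readTimeoutMillis); // applies to each read() separately
                InputStream in = client.getInputStream();
                while (in.read() != -1) {
                    // keep reading until the server closes the stream
                }
            }
            return System.currentTimeMillis() - start;
        }
    }

    public static void main(String[] args) throws Exception {
        // 5 reads, each arriving after about 200 ms, with a 500 ms read timeout:
        // the total is about one second, yet no timeout is raised.
        long elapsed = readSlowStream(5, 200, 500);
        System.out.println("Finished without a timeout after " + elapsed + " millis");
    }
}
```

The whole transfer takes about twice the configured timeout, but no single read waits longer than 500 ms, so no SocketTimeoutException is thrown. If the pause between bytes is made longer than the timeout, for example readSlowStream(2, 800, 200), the read fails with a SocketTimeoutException.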

It seems this issue will not be fixed any time soon.
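Until then, one possible workaround is to enforce the overall limit from the calling side. The following is a rough sketch, not something provided by Jsoup: it runs the request on a worker thread and gives up waiting after a fixed deadline. The class name DeadlineRunner and the method callWithDeadline are hypothetical names chosen for this example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class DeadlineRunner {

    /**
     * Runs the task on a worker thread and waits at most maxMillis for the
     * result. Throws TimeoutException if the whole task takes longer.
     */
    public static <T> T callWithDeadline(Callable<T> task, long maxMillis) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "deadline-worker");
            t.setDaemon(true); // a stuck worker must not keep the JVM alive
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(maxMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Interrupts the worker. Blocking socket I/O may not react to the
            // interrupt, so the request itself can keep running in the background.
            future.cancel(true);
            throw e;
        } catch (ExecutionException e) {
            // Unwrap so the caller sees the task's own exception (e.g. IOException).
            Throwable cause = e.getCause();
            if (cause instanceof Exception) {
                throw (Exception) cause;
            }
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // A task that finishes in time returns its result.
        System.out.println(callWithDeadline(() -> "done", 8000)); // prints "done"

        // A task that runs too long is abandoned at the deadline.
        try {
            callWithDeadline(() -> { Thread.sleep(5000); return "too late"; }, 200);
        } catch (TimeoutException e) {
            System.out.println("Aborted after the deadline");
        }
    }
}
```

With the code from the question, the call would look something like callWithDeadline(() -> Jsoup.connect(url).userAgent(USER_AGENT_STRING).timeout(TIME_OUT).get(), TIME_OUT). Note the limitation: the caller regains control after 8 seconds, but the underlying connection is not necessarily closed at that moment. Closing it reliably is the hard part jhy mentions.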

Answer 3

/**
 * Set the maximum bytes to read from the (uncompressed) connection into the body, before the connection is closed,
 * and the input truncated. The default maximum is 1MB. A max size of zero is treated as an infinite amount (bounded
 * only by your patience and the memory available on your machine).
 * @param bytes number of bytes to read from the input before truncating
 * @return this Connection, for chaining
 */
Connection maxBodySize(int bytes);

By default, Jsoup reads at most 1 MB of the response body.

Setting Jsoup.connect(url).maxBodySize(0); may fix it!

JSoup timeout does not work as expected
