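Jsoup always returns a timeout error

Date: 2017-03-07 06:45:46

Tags: timeout jsoup

I have the following code to fetch a price from yahoo finance.hk, but it always fails with a timeout error. Please help.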

    public GetStockPriceFromWebOneByOne(String url) {
        this.url = url;
    }

    private void setDataFromAAStock() throws IOException, InterruptedException {
        // timeOut is in seconds; jsoup's timeout() expects milliseconds
        Document document = Jsoup.connect(url).ignoreHttpErrors(true).timeout(timeOut * 1000).get();
        //TimeUnit.SECONDS.sleep(2);
        Elements answerers = document.select("div.yfi_rt_quote_summary div.yfi_rt_quote_summary_rt_top.sigfig_promo_0  span.time_rtq_ticker");
        // Elements answerers = document.select(".content .inline_block.vat.float_l .boxForex .font26 .neg .arr_ud.arrow_d6");
        for (Element answerer : answerers) {
            //System.out.print(answerer.text()+"\n");
            price = answerer.text(); // ends up holding the text of the last matching element
            // splitString(answerer.text());
        }
    }

    public String getDataFromAAStock() throws IOException, InterruptedException {
        setDataFromAAStock();
        return price;
    }

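2 answers:

Answer 0 (score: 0)

I haven't tried querying yahoo finance hk myself, but you should probably try setting a reasonable browser userAgent string when connecting. See the docs.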

Document document = Jsoup.connect(url)
                         .ignoreHttpErrors(true)
                         .timeout(timeOut*1000)
                         .userAgent("Mozilla/5.0 (Windows; U; WindowsNT 5.1; en-US; rv1.8.1.6) Gecko/20070725 Firefox/2.0.0.6")
                         .get();

Addendum: of course, you can also turn the timeout off completely (jsoup treats a timeout of 0 as no limit):

Document document = Jsoup.connect(url)
                         .ignoreHttpErrors(true)
                         .timeout(0)
                         .get();

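Have you looked at the network traffic between the browser and the site using the browser's developer tools? That may help you analyze the underlying problem.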
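
One way to check whether the user agent really is the deciding factor, without involving jsoup at all, is to send the same request twice with the JDK's own `HttpURLConnection` and compare the outcomes. The sketch below is only an illustration under stated assumptions: `FetchProbe` and `probe` are hypothetical names, the 10-second limit is arbitrary, and the JDK client is swapped in for jsoup purely as a diagnostic, so its behaviour may differ from jsoup's in details such as redirects.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.SocketTimeoutException;
import java.net.URI;

// Hypothetical diagnostic helper, not part of jsoup: sends one GET request
// with the JDK's HttpURLConnection and reports what happened.
class FetchProbe {

    /**
     * Returns "HTTP <status>" if the server answered within timeoutMillis,
     * or "TIMEOUT" if connecting or reading took longer.
     * A null userAgent leaves the JDK default ("Java/<version>") in place.
     */
    static String probe(String url, String userAgent, int timeoutMillis) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setConnectTimeout(timeoutMillis); // limit for establishing the connection
        conn.setReadTimeout(timeoutMillis);    // limit for waiting on response data
        if (userAgent != null) {
            conn.setRequestProperty("User-Agent", userAgent);
        }
        try {
            return "HTTP " + conn.getResponseCode();
        } catch (SocketTimeoutException e) {
            return "TIMEOUT";
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java FetchProbe <url>");
            return;
        }
        String url = args[0]; // the same URL that is passed to Jsoup.connect
        System.out.println("default UA: " + probe(url, null, 10_000));
        System.out.println("browser UA: " + probe(url, "Mozilla/5.0", 10_000));
    }
}
```

If the first line reports a timeout and the second a status code, the server is most likely treating non-browser clients differently, and setting `userAgent` in jsoup is worth trying. If both time out, the cause probably lies elsewhere, for example in the network path or a proxy.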

Answer 1 (score: 0)

I would split

Document document = Jsoup.connect(url).ignoreHttpErrors(true).timeout(timeOut*1000).get();

into

Connection connect = Jsoup.connect(url)
                     .ignoreHttpErrors(true)
                     .timeout(timeOut*1000)
                     // use this for chrome
                     .userAgent("Mozilla");

System.out.println("Connection made BEFORE document.");
Document document = connect.get();
System.out.println("Connection made AFTER document.");

I think the problem is with your connection: the request probably needs .userAgent("Mozilla") to be set before you call .get().
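
The before/after print statements show that the fetch is the step that hangs, but not for how long. A small timing wrapper makes that visible. This is a minimal sketch: `StepTimer` and `time` are hypothetical names introduced here, not part of jsoup.

```java
import java.util.concurrent.Callable;

// Hypothetical helper: runs one step, prints how long it took, and passes
// the result (or the thrown exception) through unchanged.
class StepTimer {

    static <T> T time(String label, Callable<T> step) throws Exception {
        long start = System.nanoTime();
        try {
            return step.call();
        } finally {
            // Runs on success and on failure, so a timeout is measured too.
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println(label + " took " + elapsedMs + " ms");
        }
    }
}
```

With the `connect` object from above, the fetch could then be written as `Document document = StepTimer.time("get", () -> connect.get());`. If the reported time is close to `timeOut*1000` milliseconds, the request really did run into the configured limit rather than failing early for another reason.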