Jsoup HTTP error fetching URL. Status=403 when making a request

Time: 2016-04-28 17:49:27

Tags: java jsoup

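I have been searching for this problem, and the usual advice is that setting a user agent fixes it, but that did not work in my case.
What I am trying to do is get the cookies from a request. This is the code: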

Note: I am making the request to an HTTPS web page.

/* get the cookies from the request */
        Connection.Response res = Jsoup.connect(liga).header("Content-Type","text/html;charset=UTF-8")
                .cookie("TALanguage", "ALL")
                .data("mode", "filterReviews")
                .data("filterRating", "")
                .data("filterSegment", "")
                .data("filterSeasons", "")
                .data("filterLang", "ALL")
                .referrer(liga)         
                .header("X-Requested-With", "XMLHttpRequest")
                .header("X-Puid",xpuid)
                .data("returnTo",returnTo)
                .userAgent("Mozilla/5.0 (Windows; U; WindowsNT 5.1; en-US; rv1.8.1.6) Gecko/20070725 Firefox/2.0.0.6")                           
                .method(Method.POST)
                .execute();

        doc = res.parse();


        Map<String, String> cookies = res.cookies();

The program fails at the .execute(); line, and this error appears in the log:

org.jsoup.HttpStatusException: HTTP error fetching URL. Status=403, URL=https://somepage.html

    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:459)
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:434)
    at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:181)
    at mx.oeste.crawler.htmlunit.obtenerComentarios(htmlunit.java:82)
    at mx.oeste.crawler.htmlunit.main(htmlunit.java:40)

1 answer:

Answer 0: (score: 1)

Try setting the Content-Type header to "application/x-www-form-urlencoded", like this:

Connection.Response res = Jsoup.connect(liga)
                               .header("Content-Type","application/x-www-form-urlencoded")
                               .cookie("TALanguage", "ALL")
                               .data("mode", "filterReviews")
                               .data("filterRating", "")
                               .data("filterSegment", "")
                               .data("filterSeasons", "")
                               .data("filterLang", "ALL")
                               .referrer(liga)         
                               .header("X-Requested-With", "XMLHttpRequest")
                               .header("X-Puid",xpuid)
                               .data("returnTo",returnTo)
                               .userAgent("Mozilla/5.0 (Windows; U; WindowsNT 5.1; en-US; rv1.8.1.6) Gecko/20070725 Firefox/2.0.0.6")                           
                               .method(Method.POST)
                               .execute();
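
To make the content type concrete: "application/x-www-form-urlencoded" tells the server that the POST body is a list of percent-encoded key=value pairs joined with "&". The following is only a minimal sketch of that format using the standard library; it is not how Jsoup builds the body internally. The class name FormBodySketch and the returnTo value are made up for illustration, while the other field names come from the question.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, for illustration only.
class FormBodySketch {

    // Percent-encodes one key or value as UTF-8 form data.
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    // Joins the fields as key=value pairs separated by '&', in insertion order.
    static String encode(Map<String, String> fields) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            body.add(enc(field.getKey()) + "=" + enc(field.getValue()));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("mode", "filterReviews");
        fields.put("filterRating", "");
        fields.put("filterLang", "ALL");
        fields.put("returnTo", "/some page?x=1"); // made-up value
        System.out.println(encode(fields));
    }
}
```

A body in this shape does not match a "text/html" Content-Type, which is one plausible reason a server might reject the original request.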

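If that does not work, try watching what your favorite browser does when you obtain the cookies manually with the same request. You can use the browser's developer tools to monitor it.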
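
Once the browser's request headers have been copied out of the developer tools, one way to compare them with what the program sends is to list the header names the browser has and the program lacks. This is a rough sketch under that assumption; HeaderDiffSketch and every header value in main are hypothetical placeholders, not captured traffic.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, for illustration only.
class HeaderDiffSketch {

    // Returns the header names present in the browser request but absent
    // from the program's request. HTTP header names are case-insensitive,
    // so the comparison ignores case.
    static Set<String> missingFrom(Map<String, String> browser, Map<String, String> program) {
        Set<String> sent = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        sent.addAll(program.keySet());

        Set<String> missing = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        for (String name : browser.keySet()) {
            if (!sent.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Placeholder values; paste the real ones from the developer tools.
        Map<String, String> browser = new LinkedHashMap<>();
        browser.put("Accept", "text/html");
        browser.put("Cookie", "TALanguage=ALL");
        browser.put("X-Requested-With", "XMLHttpRequest");

        Map<String, String> program = new LinkedHashMap<>();
        program.put("x-requested-with", "XMLHttpRequest");
        program.put("User-Agent", "Mozilla/5.0");

        System.out.println(missingFrom(browser, program));
    }
}
```

Any name this reports is a candidate to add to the Jsoup request with .header(...) or .cookie(...) before retrying.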