HtmlUnit - Unable to reach the results page

Date: 2014-08-01 07:40:00

Tags: java htmlunit

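I am trying to scrape dynamic content from a website. I could not do this with Jsoup, because it only gives me the static page source, so I switched to HtmlUnit for the task.

I am new to HtmlUnit, and I get an exception when I try to click the search button.

Here is my code: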

public static void main(String[] args) throws IOException {

    final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_24);

    try {
        System.out.println("Querying");

        String url = "http://www.agoda.com/city/mumbai-in.html";

        HtmlPage page = webClient.getPage(url);
        HtmlForm form = page.getHtmlElementById("aspnetForm");

        HtmlSelect checkIn = (HtmlSelect) form.getSelectsByName("ddlCheckInDay").get(0);
        checkIn.setSelectedAttribute("12", true);

        HtmlSelect checkInMY = (HtmlSelect) form.getSelectsByName("ddlCheckInMonthYear").get(0);
        checkInMY.setSelectedAttribute("8,2014", true); 

        HtmlSelect nights = (HtmlSelect) form.getSelectsByName("ctl00$ctl00$MainContent$area_promo$CitySearchBox1$ddlNights").get(0);
        nights.setSelectedAttribute("1", true);

        final HtmlSubmitInput button = form.getInputByName("ctl00$ctl00$MainContent$area_promo$CitySearchBox1$SearchButton");

        final HtmlPage secondPage = button.click();

        System.out.println(secondPage.asXml());
        System.out.println("Success");
    } catch (final FailingHttpStatusCodeException e) {
        System.out.println("One");
        e.printStackTrace();
    } catch (final MalformedURLException e) {
        System.out.println("Two");
        e.printStackTrace();
    } catch (final IOException e) {
        System.out.println("Three");
        e.printStackTrace();
    } catch (final Exception e) {
        System.out.println("Four");
        e.printStackTrace();
    }
    System.out.println("Finished");
}

I get the following exception:

EcmaError: lineNumber=[767] column=[0] lineSource=[null] name=[TypeError] sourceName=[script in http://www.agoda.com/pages/agoda/default/DestinationSearchResult.aspx?asq=bs17wTmKLORqTfZUfjFABspANSEBVRKOEdlgdhMDKXq9AiQm2HGc5Vnb5H3nW0yov9IqRFL8sIj4SMPGpGP7KXtq7DnKdEOZThQP5gmE%2bQqFC%2b63so2JAJDOwZSQfHRSODUrGKb78ZtjV5%2fnfvQuD6eARBQJoMfccTv7dm7lSbHi9gFJ3zoRUUxA1bXicT8i&tick=635424957490 from (759, 32) to (801, 10)] message=[TypeError: Cannot read property "timing" from undefined (script in http://www.agoda.com/pages/agoda/default/DestinationSearchResult.aspx?asq=bs17wTmKLORqTfZUfjFABspANSEBVRKOEdlgdhMDKXq9AiQm2HGc5Vnb5H3nW0yov9IqRFL8sIj4SMPGpGP7KXtq7DnKdEOZThQP5gmE%2bQqFC%2b63so2JAJDOwZSQfHRSODUrGKb78ZtjV5%2fnfvQuD6eARBQJoMfccTv7dm7lSbHi9gFJ3zoRUUxA1bXicT8i&tick=635424957490 from (759, 32) to (801, 10)#767)]
com.gargoylesoftware.htmlunit.ScriptException: TypeError: Cannot read property "timing" from undefined (script in http://www.agoda.com/pages/agoda/default/DestinationSearchResult.aspx?asq=bs17wTmKLORqTfZUfjFABspANSEBVRKOEdlgdhMDKXq9AiQm2HGc5Vnb5H3nW0yov9IqRFL8sIj4SMPGpGP7KXtq7DnKdEOZThQP5gmE%2bQqFC%2b63so2JAJDOwZSQfHRSODUrGKb78ZtjV5%2fnfvQuD6eARBQJoMfccTv7dm7lSbHi9gFJ3zoRUUxA1bXicT8i&tick=635424957490 from (759, 32) to (801, 10)#767)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:705)
    at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:620)
    at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:513)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.callFunction(JavaScriptEngine.java:637)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.callFunction(JavaScriptEngine.java:612)
    at com.gargoylesoftware.htmlunit.html.HtmlPage.executeJavaScriptFunctionIfPossible(HtmlPage.java:1001)
    at com.gargoylesoftware.htmlunit.javascript.host.EventListenersContainer.executeEventListeners(EventListenersContainer.java:179)
    at com.gargoylesoftware.htmlunit.javascript.host.EventListenersContainer.executeBubblingListeners(EventListenersContainer.java:239)
    at com.gargoylesoftware.htmlunit.javascript.host.Node.fireEvent(Node.java:824)
    at com.gargoylesoftware.htmlunit.javascript.host.Node.fireEvent(Node.java:748)
    at com.gargoylesoftware.htmlunit.html.HtmlElement$1.run(HtmlElement.java:920)
    at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:620)
    at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:513)
    at com.gargoylesoftware.htmlunit.html.HtmlElement.fireEvent(HtmlElement.java:925)
    at com.gargoylesoftware.htmlunit.html.HtmlPage.executeEventHandlersIfNeeded(HtmlPage.java:1298)
    at com.gargoylesoftware.htmlunit.html.HtmlPage.initialize(HtmlPage.java:290)
    at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseInto(WebClient.java:475)
    at com.gargoylesoftware.htmlunit.WebClient.loadDownloadedResponses(WebClient.java:2074)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.doProcessPostponedActions(JavaScriptEngine.java:733)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.processPostponedActions(JavaScriptEngine.java:820)
    at com.gargoylesoftware.htmlunit.html.HtmlElement.click(HtmlElement.java:1325)
    at com.gargoylesoftware.htmlunit.html.HtmlElement.click(HtmlElement.java:1268)
    at com.gargoylesoftware.htmlunit.html.HtmlElement.click(HtmlElement.java:1216)
    at com.dhanraj.Main.main(Main.java:57)

The JavaScript involved:

function (a) {
    return typeof f != "undefined" && (!a || f.event.triggered !== a.type) ? f.event.dispatch.apply(i.elem, arguments) : b;
}

Can anyone help me or correct me? Thanks in advance.

1 answer:

Answer 0: (score: 0)

It works fine now. I just added one line of code:

webClient.getOptions().setJavaScriptEnabled(false);

It worked like a charm!
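
Note that turning JavaScript off avoids the ScriptException, but it also stops the page's scripts from running, so any content those scripts would have generated will likely be missing. If that content is needed, one possible alternative is to keep JavaScript enabled and tell HtmlUnit not to throw on script errors. The sketch below assumes an HtmlUnit 2.x release matching the package names in the stack trace above. The class name `TolerantClientSketch`, the helper `newClient()`, and the 10-second wait are illustrative assumptions, not part of the original post, and it has not been checked whether the site renders correctly with these settings.

```java
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

class TolerantClientSketch {

    // Hypothetical helper (not from the original post): builds a client that
    // still runs JavaScript but does not abort when a page script fails.
    static WebClient newClient() {
        WebClient webClient = new WebClient(BrowserVersion.FIREFOX_24);
        webClient.getOptions().setJavaScriptEnabled(true);
        // Script errors are reported through logging instead of being thrown.
        webClient.getOptions().setThrowExceptionOnScriptError(false);
        return webClient;
    }

    public static void main(String[] args) throws Exception {
        WebClient webClient = newClient();
        HtmlPage page = webClient.getPage("http://www.agoda.com/city/mumbai-in.html");
        // Assumed timeout: give background scripts up to 10 seconds to finish
        // before reading the DOM.
        webClient.waitForBackgroundJavaScript(10_000);
        System.out.println(page.getTitleText());
    }
}
```

The same two option calls could be added right after the `WebClient` is created in the question's `main` method, in place of `setJavaScriptEnabled(false)`.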