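Login using HtmlUnit

Time: 2012-05-25 07:31:26

Tags: java htmlunit

I am very new to HtmlUnit. I would like to know whether I can use HtmlUnit to log in to a website and perform certain operations there. For example, I would like to log in to my office portal and request leave.

I am using HtmlUnit and it shows some errors. Is this possible with HtmlUnit, or is there another tool that can be used for this purpose?
Here is my code: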

final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_3_6);        
webClient.setJavaScriptEnabled(true);
webClient.getCookieManager().setCookiesEnabled(true);

final HtmlPage page1 =  webClient.getPage("http://www.ccstechnologies.org/login.aspx/");
final HtmlForm form = page1.getFormByName("form1");         
final HtmlSubmitInput button =  form.getInputByName("BtnLogin");
final HtmlTextInput textField =  form.getInputByName("Username");
final HtmlPasswordInput pwd =  form.getInputByName("password");        
textField.setValueAttribute("username");
pwd.setValueAttribute("password");      

final HtmlPage page2 =  button.getEnclosingForm().click();  
String htmlBody = page2.getWebResponse().getContentAsString(); 

System.out.println("Base Uri 1 : "+page1);
System.out.println("Base Uri 2 : "+page2);

webClient.closeAllWindows();

But when I print page2, it only shows the URL of the login page; it does not return the home page URL. What could be the problem?

This is what I get in the console when I click the form:

May 28, 2012 11:44:15 AM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'application/x-javascript'.
May 28, 2012 11:44:16 AM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'application/x-javascript'.
Base Uri 1 : HtmlPage(http://www.ccstechnologies.org/login.aspx/)@2741851
Base Uri 2 : HtmlPage(http://www.ccstechnologies.org/login.aspx/)@2741851

This is the result produced when I click the button:

May 29, 2012 10:00:02 AM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'application/x-javascript'.
May 29, 2012 10:00:02 AM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'application/x-javascript'.
May 29, 2012 10:00:03 AM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'application/x-javascript'.
May 29, 2012 10:00:03 AM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'application/x-javascript'.
May 29, 2012 10:00:03 AM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'application/x-javascript'.
May 29, 2012 10:00:03 AM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'application/x-javascript'.
May 29, 2012 10:00:03 AM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'text/javascript'.
May 29, 2012 10:00:03 AM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'text/javascript'.
May 29, 2012 10:00:03 AM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'text/javascript'.
May 29, 2012 10:00:03 AM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'text/javascript'.
May 29, 2012 10:00:03 AM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'text/javascript'.
May 29, 2012 10:00:03 AM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'text/javascript'.
May 29, 2012 10:00:03 AM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: [259:24] Error in expression. Invalid token "=". Was expecting one of: <S>, <COMMA>, "/", <PLUS>, "-", <HASH>, <STRING>, ")", <URI>, "inherit", <EMS>, <EXS>, <LENGTH_PX>, <LENGTH_CM>, <LENGTH_MM>, <LENGTH_IN>, <LENGTH_PT>, <LENGTH_PC>, <ANGLE_DEG>, <ANGLE_RAD>, <ANGLE_GRAD>, <TIME_MS>, <TIME_S>, <FREQ_HZ>, <FREQ_KHZ>, <DIMENSION>, <PERCENTAGE>, <NUMBER>, <FUNCTION>, <IDENT>.
May 29, 2012 10:00:03 AM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: [259:29] Error in style rule. Invalid token "\r\n   ". Was expecting one of: "}", ";".
May 29, 2012 10:00:03 AM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler warning
WARNING: CSS warning: [259:29] Ignoring the following declarations in this rule.
HtmlPage(http://192.168.0.5/login.aspx)@23511316
HtmlPage(http://192.168.0.5/login.aspx)@17700115

4 answers:

Answer 0: (score: 4)

OK, I looked into it, and the problem seems to be the button. I replaced your line of code with this one:

 final HtmlPage page2 =  (HtmlPage) form.getInputByValue("Login").click();

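Now it at least seems to attempt the login (the page prints "invalid login", of course), so it should work with proper credentials. To print a page in Java and inspect it, use System.out.println(page1.asText()) or asXml(), depending on what you want to see.

Here is my final code: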

final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_3_6);
webClient.setJavaScriptEnabled(true);
webClient.getCookieManager().setCookiesEnabled(true);

try {
    final HtmlPage page1 = webClient.getPage("http://www.ccstechnologies.org/login.aspx/");
    final HtmlForm form = page1.getFormByName("form1");
    final HtmlTextInput textField = form.getInputByName("Username");
    final HtmlPasswordInput pwd = form.getInputByName("password");
    textField.setValueAttribute("username");
    pwd.setValueAttribute("password");
    System.out.println(page1.asText());

    // Click the login button itself rather than the enclosing form.
    final HtmlPage page2 = (HtmlPage) form.getInputByValue("Login").click();

    System.out.println(page2.asText());
    System.out.println("Base Uri 1 : " + page1);
    System.out.println("Base Uri 2 : " + page2);
} catch (Exception e) {
    // Report the failure instead of silently swallowing it.
    e.printStackTrace();
} finally {
    webClient.closeAllWindows();
}
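
To check programmatically whether the click actually left the login page, one option is to compare the two page URLs instead of reading them off the console. The following is only a minimal sketch using the standard library: the class name LoginCheck and the home.aspx URL are hypothetical, and it assumes you pass in the URLs as strings (for example, the ones printed above). A trailing slash, the case of the host name, and any query string are ignored.

```java
import java.net.URI;
import java.util.Locale;

public final class LoginCheck {

    // Reduce a URL to "host + path", dropping trailing slashes and the query string.
    static String normalize(String url) {
        URI uri = URI.create(url).normalize();
        String host = uri.getHost() == null ? "" : uri.getHost().toLowerCase(Locale.ROOT);
        String path = uri.getPath() == null ? "" : uri.getPath();
        while (path.endsWith("/")) {
            path = path.substring(0, path.length() - 1);
        }
        return host + path;
    }

    // True if both URLs point at the same page, i.e. the login did not navigate away.
    public static boolean stillOnSamePage(String before, String after) {
        return normalize(before).equals(normalize(after));
    }

    public static void main(String[] args) {
        String login = "http://www.ccstechnologies.org/login.aspx/";
        // Hypothetical post-login URL, used only for illustration.
        String home = "http://www.ccstechnologies.org/home.aspx";
        System.out.println(stillOnSamePage(login, login));
        System.out.println(stillOnSamePage(login, home));
    }
}
```

If stillOnSamePage returns true for the pages before and after the click, the login most likely failed or the click never submitted the form.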

Answer 1: (score: 3)

This is what you should set for JavaScript:

webClient.getOptions().setJavaScriptEnabled(false);

You can also add these:

webClient.getOptions().setRedirectEnabled(true);
webClient.getOptions().setThrowExceptionOnScriptError(false);
webClient.getOptions().setCssEnabled(false);
webClient.getOptions().setUseInsecureSSL(true);
webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
webClient.getCookieManager().setCookiesEnabled(true);

This should solve the problem, as it did for me.

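Answer 2: (score: 1)

Try enabling cookies and JavaScript, and ignore the errors it may print. (I used to think the errors printed in red were bad; in HtmlUnit that does not necessarily seem to be the case.)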
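
If the warnings are just noise, they can probably be silenced rather than ignored by eye. The console output in the question has the format of java.util.logging, so the sketch below assumes HtmlUnit's log messages end up there (which is not the case if another logging backend such as log4j is configured). The logger name is the package name taken from the log lines above; the class name QuietHtmlUnitLogging is hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public final class QuietHtmlUnitLogging {

    // Keep a strong reference: java.util.logging holds loggers weakly,
    // so the configured level could otherwise be lost after garbage collection.
    private static final Logger HTMLUNIT_LOGGER =
            Logger.getLogger("com.gargoylesoftware.htmlunit");

    // Turn off all messages from HtmlUnit's loggers (child loggers inherit the level).
    public static Logger silence() {
        HTMLUNIT_LOGGER.setLevel(Level.OFF);
        return HTMLUNIT_LOGGER;
    }

    public static void main(String[] args) {
        silence();
        System.out.println("HtmlUnit log level: " + HTMLUNIT_LOGGER.getLevel());
    }
}
```

Call silence() once before creating the WebClient. While debugging a login that does not work, it may be better to leave the warnings on, since they can hint at scripts that failed to run.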

Answer 3: (score: 1)

If the website uses an AJAX call to log in, this worked for me. Set this:

webClient.setAjaxController(new NicelyResynchronizingAjaxController());

This causes all AJAX calls to be made synchronously.

This is how I set up the WebClient object:

WebClient webClient = new WebClient(BrowserVersion.CHROME);
webClient.getOptions().setJavaScriptEnabled(true);
webClient.getOptions().setCssEnabled(false);
webClient.getOptions().setUseInsecureSSL(true);
webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
webClient.getCookieManager().setCookiesEnabled(true);
webClient.setAjaxController(new NicelyResynchronizingAjaxController());
webClient.getOptions().setThrowExceptionOnScriptError(false);