HtmlUnit input submit returns the same page

Date: 2018-01-20 19:19:08

Tags: forms login submit htmlunit

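I am trying to log in with my user on a website where the login button is an input of type submit. I have tried different strategies to fill in the login page with my credentials and click the button, but click() always returns the login page. I never get the page that a real browser shows after logging in.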

This is my code:

String applicationName = "Mozilla";
String applicationVersion = "5.0 (Windows NT 6.3; WOW64; rv:56.0) Gecko/20100101 Firefox/56.0";
final String userAgent = applicationName + "/" + applicationVersion;
BrowserVersion browserVersion = new BrowserVersion.BrowserVersionBuilder(BrowserVersion.FIREFOX_52)
      .setApplicationName(applicationName)
      .setApplicationVersion(applicationVersion)
      .setUserAgent(userAgent)
      .build();

webClient = new WebClient(browserVersion);
java.util.logging.Logger.getLogger("com.gargoylesoftware.htmlunit").setLevel(java.util.logging.Level.ALL); 
java.util.logging.Logger.getLogger("org.apache.commons.httpclient").setLevel(java.util.logging.Level.ALL);

webClient.setAjaxController(new com.gargoylesoftware.htmlunit.NicelyResynchronizingAjaxController());
webClient.setIncorrectnessListener(new com.gargoylesoftware.htmlunit.IncorrectnessListener() {
        @Override public void notify(String arg0, Object arg1) {} 
    });

webClient.setJavaScriptErrorListener(new com.gargoylesoftware.htmlunit.javascript.JavaScriptErrorListener() {
        @Override public void timeoutError(HtmlPage arg0, long arg1, long arg2) {}   
        @Override public void scriptException(final HtmlPage arg0, final com.gargoylesoftware.htmlunit.ScriptException arg1) {} 
        @Override public void malformedScriptURL(HtmlPage arg0, String arg1, java.net.MalformedURLException arg2) {}
        @Override public void loadScriptError(HtmlPage arg0, java.net.URL arg1, Exception arg2) {}
    });

webClient.setCssErrorHandler(new com.gargoylesoftware.htmlunit.SilentCssErrorHandler());
webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
webClient.getOptions().setThrowExceptionOnScriptError(false);
webClient.getOptions().setDoNotTrackEnabled(true);
webClient.getOptions().setActiveXNative(true);
webClient.getOptions().setRedirectEnabled(true);
webClient.getOptions().setPrintContentOnFailingStatusCode(true);
webClient.getCookieManager().setCookiesEnabled(true);

webClient.getOptions().setDownloadImages(true);

final int sleepMinSeconds = 1;
final int sleepRandomSeconds = 2;
final long javascriptTimeout = 10000;

System.out.println("Connecting to http://www.milanuncios.com... (" + webClient.getBrowserVersion() + ")");
String loginURL = "https://www.milanuncios.com/mis-anuncios";

System.out.print("    Waiting to avoid being detected as a robot...");
Thread.sleep((long)(Math.random()*sleepRandomSeconds) * 1000);
HtmlPage page = webClient.getPage(loginURL);
if (!page.asXml().contains("<html xmlns=\"http://www.w3.org/1999/xhtml\" xml:lang=\"es\" lang=\"es\">")) {
    System.out.println(page.asXml());
    System.out.println("\nDetectado como robot. Saliendo.");
    return;
}
System.out.print("    \nWaiting for Javascript to complete...");
webClient.waitForBackgroundJavaScript(javascriptTimeout);
System.out.println("\nOK");

System.out.println("\nDo login...");
System.out.print("    Waiting to avoid being detected as a robot...");
Thread.sleep((sleepMinSeconds + (long)(Math.random()*sleepRandomSeconds)) * 1000);

//<form method="post" action="" onsubmit="return estabien()" class="frmMisAnuncios">
HtmlForm loginForm = (HtmlForm)page.getFirstByXPath("//form[@onsubmit='return estabien()']");
((HtmlInput)loginForm.getOneHtmlElementByAttribute("input", "id", "email")).type("my username");
((HtmlInput)loginForm.getOneHtmlElementByAttribute("input", "id", "contra")).type("my password");
HtmlInput btnSend = (HtmlInput)loginForm.getOneHtmlElementByAttribute("input", "class", "submit btnSend");
page = btnSend.click();
System.out.print("    \nWaiting for Javascript to complete...");
webClient.waitForBackgroundJavaScript(javascriptTimeout);
System.out.println(page.asXml());

The resulting page is always the login page. Why?

The login form is as follows:

<form method="post" action="" onsubmit="return estabien()" class="frmMisAnuncios">
    <input value="0" type=hidden id="recarga">
    <input value="0" type=hidden id="mensajes">
    <div class="sumario">
        <img src="https://static.milanuncios.com/imagenes/userarea/ic_user_avatar.png" width="40" height="40" alt="Acceso a mis anuncios"/> Acceso a mis anuncios
    </div>
    <div class="loginText">
        Email
    </div>
    <div>
        <input value="" type=text id="email" maxlength="50" class="field" tabindex="1" autofocus placeholder="Email">
    </div>
    <div>
        <input value="" type=password id="contra" maxlength="4" class="field" tabindex="2" placeholder="Contraseña">
    </div>
    <div>
        <input type="checkbox" value="s" id="rememberme" name="rememberme" class="field" tabindex="3" checked="checked"> No cerrar sesión
    </div>
    <div class="fbforgotpasstext">
        <a href="javascript:forgotPassword()" class="effect" id="txtforgotpassword">Olvidé mi contraseña</a>
    </div>
    <div class="btnEnviarFrm">
        <input type="submit" tabindex="3" class="submit btnSend" value="ENVIAR">
    </div>
</form>

And the estabien() JavaScript function:

function estabien() {
    var email = document.getElementById('email').value;
    var contra = document.getElementById('contra').value;
    var recarga = document.getElementById('recarga').value;
    var mensajes = document.getElementById('mensajes').value;
    var rememberme = document.getElementById('rememberme').checked;
    if ((email == '') || (contra == '')) {
        alert('Por favor, proporcione el email y la contraseña de anuncio.');
        return false;
    }

    var ajax = newAjax();

    ajax.open("POST", '/cmd/');
    ajax.setRequestHeader('Content-type', 'application/x-www-form-urlencoded');
    ajax.send("comando=login&email=" + email + '&contra=' + contra + (rememberme ? '&rememberme=s' : ''));
    ajax.onreadystatechange = function () {
        if (ajax.readyState == 4) {
            oculta('espera');
            if (ajax.responseText == 'login') {
                if (recarga == 1) {
                    document.location = '/creditos/recargar.php';
                }
                else if (mensajes == 1) {
                    document.location = '/mis-mensajes/';
                }
                else document.location = '/mis-anuncios/';
            }
            else if (ajax.responseText.indexOf('emailantiguo') != -1) {
                alert('ATENCIÓN: El sistema de acceso a los anuncios ha cambiado ahora hay una sola contraseña para todos ' +
                        'tus anuncios, si no has recibido la nueva contraseña solicítala en recordatorio de contraseñas.');
            }
            else if (ajax.responseText.indexOf('emailantiguoenviado') != -1) {
                alert('ATENCIÓN: El sistema de acceso a los anuncios ha cambiado ahora hay una sola contraseña para todos ' +
                        'tus anuncios, acabamos de enviar a tu correo la nueva clave (si no aparece no olvides mirar en la carpeta de spam o correo no deseado).');
            }
            else if (ajax.responseText.indexOf('malcontra') != -1) {
                alert('El email o la contraseña no son correctos.\r\n\r\nRecuerda que las contraseñas de milanuncios\r\nson de 4 caracteres de longitud');
            }
            else {
                alert('Se produjo un error, inténtelo de nuevo ' + ajax.responseText);
                document.location = '/mis-anuncios/';
            }
        }
    };
    ver('espera');
    return false;
}
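
For reference, the request that estabien() sends can be rebuilt outside the browser. The sketch below only reconstructs the same form-encoded body in Java. The names LoginBodySketch and buildLoginBody are made up for illustration, and the URL-encoding step is an addition of mine (the site's script concatenates the raw values).

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only: rebuilds the body that estabien() passes to ajax.send().
class LoginBodySketch {

    // Hypothetical helper; parameter names mirror the form field ids.
    static String buildLoginBody(String email, String contra, boolean rememberMe) {
        // URLEncoder.encode(String, Charset) requires Java 10 or later.
        // The original script does NOT encode the values; encoding is an assumption
        // added here so that characters such as '&' or '@' do not break the body.
        return "comando=login"
                + "&email=" + URLEncoder.encode(email, StandardCharsets.UTF_8)
                + "&contra=" + URLEncoder.encode(contra, StandardCharsets.UTF_8)
                + (rememberMe ? "&rememberme=s" : "");
    }

    public static void main(String[] args) {
        // Placeholder credentials, not real ones.
        System.out.println(buildLoginBody("user@example.com", "ab&1", true));
    }
}
```

Note that estabien() always returns false, so the form itself is never submitted. The navigation only happens later, inside the onreadystatechange callback.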

[Edit] As additional information, when the login page is loaded for the first time, it shows a JavaScript error:

EcmaError: lineNumber=[1] column=[0] lineSource=[  function () {]name=[TypeError] sourceName=[https://jssdk.pulse.schibsted.com/autoTrackerMilanuncios.min.js] message=[TypeError: Cannot read property "0" from undefined (https://jssdk.pulse.schibsted.com/autoTrackerMilanuncios.min.js#1)]
com.gargoylesoftware.htmlunit.ScriptException: TypeError: Cannot read property "0" from undefined (https://jssdk.pulse.schibsted.com/autoTrackerMilanuncios.min.js#1)
....
== CALLING JAVASCRIPT ==
  function () {
      o.cb();
  }

After click(), when the login page is shown again, this error no longer appears.

2 answers:

Answer 0: (score: 0)

It looks like your login does some AJAX magic, so a redirect of some kind takes place to reach the real result page:

document.location = '/mis-anuncios/';

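This call replaces the content of the current browser window with the content of the new URL. Because the AJAX request is asynchronous and click() returns its result synchronously, the page variable still points to the initial page. You have to re-fetch the content of the current window after waiting. Try something like this: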

page = btnSend.click();
System.out.print("    \nWaiting for Javascript to complete...");
webClient.waitForBackgroundJavaScript(javascriptTimeout);

// re-fetch the page of the current window to pick up the AJAX redirect
page = (HtmlPage) page.getEnclosingWindow().getTopWindow().getEnclosedPage();

System.out.println(page.asXml());
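
To show the timing issue in isolation, here is a toy model in plain Java. ToyWindow and its methods are hypothetical stand-ins, not HtmlUnit classes. The only point is that the value returned at click time is a snapshot, while the window's current page changes later.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;

// Toy model only: none of these types belong to HtmlUnit.
class StalePageDemo {

    /** Hypothetical stand-in for a browser window that holds its current page. */
    static final class ToyWindow {
        private final AtomicReference<String> enclosedPage =
                new AtomicReference<>("login page");

        String getEnclosedPage() {
            return enclosedPage.get();
        }

        /** Mimics click(): returns the current page at once and starts an async "redirect". */
        String click(CompletableFuture<Void> redirectDone) {
            String pageAtClickTime = enclosedPage.get();
            CompletableFuture.runAsync(() -> {
                // Plays the role of document.location = '/mis-anuncios/' in the callback.
                enclosedPage.set("my ads page");
                redirectDone.complete(null);
            });
            return pageAtClickTime;
        }
    }

    /** Returns { page returned by click(), page re-fetched from the window }. */
    static String[] run() {
        ToyWindow window = new ToyWindow();
        CompletableFuture<Void> redirectDone = new CompletableFuture<>();

        String page = window.click(redirectDone); // snapshot taken before the redirect
        redirectDone.join();                      // plays the role of waiting for background JS
        String refreshed = window.getEnclosedPage();

        return new String[] { page, refreshed };
    }

    public static void main(String[] args) {
        String[] result = run();
        System.out.println("returned by click():    " + result[0]);
        System.out.println("re-fetched from window: " + result[1]);
    }
}
```

Waiting alone does not update the first variable; only asking the window again yields the page that the redirect loaded.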

Hope that helps.

Answer 1: (score: 0)

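With the incredible help of RBRi, who led me to reduce the program to its simplest version, I was able to find the line of code that causes the problem: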

webClient.setAjaxController(new com.gargoylesoftware.htmlunit.NicelyResynchronizingAjaxController());

Removing this line and adding the correct page retrieval suggested by RBRi:

page = (HtmlPage) page.getEnclosingWindow().getTopWindow().getEnclosedPage();

Problem solved!

I hope this discussion can help someone. Bye.