I am trying to simulate the login process for my Facebook page using HtmlUnit (I do have a good reason for doing this). Here is my Java code:
public static void main(String[] args) throws IOException {
    //tried to experiment with the browser types also. But to the same result
    //even using no param constructor does not help.
    WebClient webClient=new WebClient(BrowserVersion.CHROME);
    HtmlPage page1=webClient.getPage("https://www.facebook.com/bhramakarserver");
    HtmlForm loginForm=(HtmlForm)page1.getElementById("login_form");
    HtmlTextInput username=(HtmlTextInput)page1.getElementById("email");
    HtmlPasswordInput password=(HtmlPasswordInput)page1.getElementById("pass");
    username.setValueAttribute("myFbUsername");
    password.setValueAttribute("myFbPassword");
    HtmlElement button = (HtmlElement) page1.createElement("button");
    button.setAttribute("type", "submit");
    // append the button to the form
    loginForm.appendChild(button);
    page1=button.click();
    //page1.executeJavaScript("window.scrollBy(0,6000)"); does not work
    System.out.println(page1.asXml());
    HtmlSpan postContentSpan=(HtmlSpan)page1.getByXPath("//span[@class='userContent']").get(0);
    System.out.println(postContentSpan.asXml());
}
When I run it, I get the following error:
Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at java.util.ArrayList.rangeCheck(ArrayList.java:604)
at java.util.ArrayList.get(ArrayList.java:382)
at com.rahulserver.fbhighlight.Main.main(Main.java:35)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)
Clearly, the offending line is
HtmlSpan postContentSpan=(HtmlSpan)page1.getByXPath("//span[@class='userContent']").get(0);
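The XPath matches nothing: getByXPath returns an empty list (not null), so get(0) throws. I posted a related question about this and was told that the markup containing the span targeted by this XPath is commented out in the page source, which is why nothing is found.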
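So why does this happen, and how can I make it work? Since Facebook normally loads more content as you scroll down the page, I tried to simulate that with
page1.executeJavaScript("window.scrollBy(0,6000)");
but it does not work and I get the same result. Here is a pastebin link to the generated HTML: http://pastebin.com/MfXsYSJQ.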
I am sure someone on SO can come up with an out-of-the-box answer...
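To illustrate the point raised in the question (markup wrapped in a comment is invisible to an element XPath), here is a minimal sketch that uses only the JDK's own XML and XPath classes rather than HtmlUnit. The class name, the helper names, and the sample document are all hypothetical; the sample is a simplified, well-formed stand-in and not the real Facebook page. It shows that the query returns an empty result rather than null, and that the span can be recovered by re-parsing the comment text.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class CommentedMarkupDemo {
    static final String SPAN_XPATH = "//span[@class='userContent']";

    // Hypothetical sample: the post markup sits inside an XML/HTML comment.
    static final String SAMPLE = "<html><body><code id=\"hidden\">"
            + "<!-- <div><span class=\"userContent\">Hello</span></div> -->"
            + "</code></body></html>";

    /** Parses well-formed XML; throws IllegalArgumentException otherwise. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    /** Evaluates an XPath expression and returns the (possibly empty) node list. */
    static NodeList select(Document doc, String expression) {
        try {
            return (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath: " + expression, e);
        }
    }

    /** Text of the first matching span, also looking inside comments; null if none. */
    static String firstUserContent(String xml) {
        Document doc = parse(xml);
        NodeList direct = select(doc, SPAN_XPATH);
        if (direct.getLength() > 0) {
            return direct.item(0).getTextContent();
        }
        // Nothing matched directly: try re-parsing the body of each comment.
        NodeList comments = select(doc, "//comment()");
        for (int i = 0; i < comments.getLength(); i++) {
            String body = comments.item(i).getNodeValue().trim();
            try {
                NodeList inner = select(parse(body), SPAN_XPATH);
                if (inner.getLength() > 0) {
                    return inner.item(0).getTextContent();
                }
            } catch (IllegalArgumentException notMarkup) {
                // The comment is not well-formed markup; skip it.
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        System.out.println("Direct matches: " + select(doc, SPAN_XPATH).getLength());
        System.out.println("Recovered text: " + firstUserContent(SAMPLE));
    }
}
```

Real pages are rarely well-formed XML, so with HtmlUnit the comment text would presumably have to be re-parsed with an HTML parser instead; that part is an assumption and is not shown here. Either way, checking that the result list is non-empty before calling get(0) avoids the IndexOutOfBoundsException.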
Answer 0 (score: 0)
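The problem comes from the browser version you are emulating; you also need to enable AJAX support and wait for background JavaScript. Change the browser version and add a few more lines, as follows: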
WebClient webClient=new WebClient(BrowserVersion.FIREFOX_3_6);
webClient.setAjaxController(new NicelyResynchronizingAjaxController());
webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
webClient.getOptions().setThrowExceptionOnScriptError(false);
webClient.waitForBackgroundJavaScript(50000);
BrowserVersion.FIREFOX_3_6 is deprecated, but the application works better when run with it.
If this meets your requirements, feel free to mark it as the correct answer.
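One note on the JavaScript waiting: as far as I know, waitForBackgroundJavaScript only waits for jobs that are already scheduled, so calling it before getPage has nothing to wait for. A more robust pattern is to poll for the element after the page has loaded. Below is a minimal, library-independent sketch of such a polling loop; the class name Polling and the method waitUntil are hypothetical helpers, not part of HtmlUnit, and the sketch assumes Java 8 or later.

```java
import java.util.function.BooleanSupplier;

final class Polling {
    /**
     * Re-checks the condition every pollMillis until it holds or
     * timeoutMillis has elapsed. Returns true if the condition held,
     * false on timeout or if the thread was interrupted.
     */
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in condition: becomes true about 200 ms after start.
        boolean ready = waitUntil(() -> System.currentTimeMillis() - start >= 200, 2000, 20);
        System.out.println("Condition met: " + ready);
    }
}
```

With HtmlUnit, the condition could be something like () -> !page1.getByXPath("//span[@class='userContent']").isEmpty(), evaluated after button.click(), and the sleep could be swapped for webClient.waitForBackgroundJavaScript(pollMillis) so that pending scripts get a chance to run between checks. This only helps if the span is eventually added to the DOM; it does not help if the markup stays inside a comment.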
Answer 1 (score: 0)
The following code works on my system. Please find the code below:
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.NicelyResynchronizingAjaxController;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlElement;
import com.gargoylesoftware.htmlunit.html.HtmlForm;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlPasswordInput;
import com.gargoylesoftware.htmlunit.html.HtmlSpan;
import com.gargoylesoftware.htmlunit.html.HtmlTextInput;
import java.io.IOException;

public class App {
    public static void main(String[] args) throws IOException {
        WebClient webClient=new WebClient(BrowserVersion.FIREFOX_3_6);
        webClient.setAjaxController(new NicelyResynchronizingAjaxController());
        webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
        webClient.getOptions().setThrowExceptionOnScriptError(false);
        webClient.waitForBackgroundJavaScript(50000);
        HtmlPage page1=webClient.getPage("https://www.facebook.com/bhramakarserver");
        HtmlForm loginForm=(HtmlForm)page1.getElementById("login_form");
        HtmlTextInput username=(HtmlTextInput)page1.getElementById("email");
        HtmlPasswordInput password=(HtmlPasswordInput)page1.getElementById("pass");
        username.setValueAttribute("username");
        password.setValueAttribute("password");
        HtmlElement button = (HtmlElement) page1.createElement("button");
        button.setAttribute("type", "submit");
        // append the button to the form
        loginForm.appendChild(button);
        page1=button.click();
        HtmlSpan postContentSpan=(HtmlSpan)page1.getByXPath("//span[@class='userContent']").get(0);
        System.out.println("The content is "+postContentSpan.asXml());
    }
}