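I would like to share how to retrieve the content of an HTML page after it has been changed by AJAX.
The following code returns the old page: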
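import java.io.IOException;
import java.net.MalformedURLException;
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException;
import com.gargoylesoftware.htmlunit.NicelyResynchronizingAjaxController;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class Test {
    public static void main(String[] args) throws FailingHttpStatusCodeException, MalformedURLException, IOException, InterruptedException {
        String url = "valid html page";
        WebClient client = new WebClient(BrowserVersion.FIREFOX_17);
        client.getOptions().setJavaScriptEnabled(true);
        client.getOptions().setRedirectEnabled(true);
        client.getOptions().setThrowExceptionOnScriptError(true);
        client.getOptions().setCssEnabled(true);
        client.getOptions().setUseInsecureSSL(true);
        client.getOptions().setThrowExceptionOnFailingStatusCode(false);
        // Re-synchronize AJAX calls so they complete before execution continues.
        client.setAjaxController(new NicelyResynchronizingAjaxController());
        HtmlPage page = client.getPage(url);
        // Prints the page without the AJAX changes.
        System.out.println(page.getWebResponse().getContentAsString());
    }
}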
What is going on here?
Answer 0 (score: 1)
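The answer is that page.getWebResponse() gives you the initial page, that is, the content as the server originally delivered it.
To get the updated content, we have to use the page variable itself: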
package utils;

import java.io.IOException;
import java.net.MalformedURLException;
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException;
import com.gargoylesoftware.htmlunit.NicelyResynchronizingAjaxController;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class Test {
    public static void main(String[] args) throws FailingHttpStatusCodeException, MalformedURLException, IOException, InterruptedException {
        String url = "valid html page";
        WebClient client = new WebClient(BrowserVersion.FIREFOX_17);
        client.getOptions().setJavaScriptEnabled(true);
        client.getOptions().setRedirectEnabled(true);
        client.getOptions().setThrowExceptionOnScriptError(true);
        client.getOptions().setCssEnabled(true);
        client.getOptions().setUseInsecureSSL(true);
        client.getOptions().setThrowExceptionOnFailingStatusCode(false);
        client.setAjaxController(new NicelyResynchronizingAjaxController());
        HtmlPage page = client.getPage(url);
        // Current state of the page, including changes made by AJAX.
        System.out.println(page.asXml());
        // Original server response, kept here only for comparison.
        System.out.println(page.getWebResponse().getContentAsString());
    }
}
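I found the hint at the following link:
http://htmlunit.10904.n7.nabble.com/Not-expected-result-code-from-htmlunit-td28275.html
Ahmed Ashour wrote: Hi, you should not use WebResponse, which is meant for getting the actual content from the server. You should use htmlPage.asText() or .asXml(). Yours, Ahmed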
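To illustrate why the two calls give different results, here is a minimal sketch that uses only the JDK's DOM API, not HtmlUnit. The class name, the sample markup, and the simulateAjaxUpdate helper are made up for this illustration. The raw string plays the role of getWebResponse().getContentAsString(), and serializing the tree plays the role of page.asXml(); how HtmlUnit does this internally is not shown here.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class SnapshotVersusLiveDom {

    // Hypothetical stand-in for the body the server originally sent.
    static final String RAW_RESPONSE =
            "<html><body><div id=\"result\">loading</div></body></html>";

    // Parse the raw text into a DOM tree, roughly as a browser would.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Simulate what an AJAX callback does: modify a node in the tree.
    static void simulateAjaxUpdate(Document dom) {
        Element div = (Element) dom.getElementsByTagName("div").item(0);
        div.setTextContent("loaded by ajax");
    }

    // Serialize the tree in its current state back to markup.
    static String serialize(Document dom) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.METHOD, "xml");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(dom), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Document dom = parse(RAW_RESPONSE);
        simulateAjaxUpdate(dom);
        // The raw string never changes; only the tree does.
        System.out.println("raw:  " + RAW_RESPONSE);
        System.out.println("live: " + serialize(dom));
    }
}
```

The raw string still says "loading" after the update, while the serialized tree shows the new text. That is the same difference you see between the web response and the page object in the code above.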