HtmlUnit does not load part of this page:
https://www.milanuncios.com/mis-anuncios/
When I inspect the page in a browser, this element:
<div class="ma-LayoutBasicMainContent">
has a lot of content, but it is empty when the page is loaded by HtmlUnit.
I have tried various webClient settings, including:
webClient.setAjaxController(new NicelyResynchronizingAjaxController());
webClient.getOptions().setDownloadImages(true);
webClient.getOptions().setCssEnabled(true);
webClient.getOptions().setJavaScriptEnabled(true);
webClient.setJavaScriptTimeout(10000);
But the result is always the same: the "ma-LayoutBasicMainContent" section is not loaded. This is the code I am using:
import com.gargoylesoftware.htmlunit.NicelyResynchronizingAjaxController;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.*;

class MarnvHtmlUnitTest {
    public static void main(String[] args) {
        WebClient webClient = null;
        try {
            final long javascriptTimeout = 10000;
            webClient = new WebClient();
            webClient.setAjaxController(new NicelyResynchronizingAjaxController());
            webClient.getOptions().setDownloadImages(true);
            webClient.getOptions().setCssEnabled(true);
            webClient.getOptions().setJavaScriptEnabled(true);
            webClient.setJavaScriptTimeout(10000);

            String loginURL = "https://www.milanuncios.com/mis-anuncios/";
            System.out.println("Connecting to " + loginURL + " (" + webClient.getBrowserVersion() + ")");
            HtmlPage page = webClient.getPage(loginURL);

            System.out.print(" Waiting for Javascript to complete...");
            long millis = System.currentTimeMillis();
            webClient.waitForBackgroundJavaScript(javascriptTimeout);
            System.out.println(System.currentTimeMillis() - millis + " milliseconds");

            if (!page.asText().contains("ACCESO A MIS ANUNCIOS")) {
                System.out.println("ERROR!");
                System.out.println(page.asXml());
                System.out.println("EXITING. " + webClient.getWebWindows().size());
                return;
            }
            System.out.println("OK");
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            if (webClient != null)
                webClient.close();
        }
    }
}
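If the page loads correctly, it should contain the text "ACCESO A MIS ANUNCIOS". Note that waitForBackgroundJavaScript returns immediately, which seems odd to me; it usually waits a few seconds until the page has fully loaded. I am using HtmlUnit 2.35.0.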
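Since waitForBackgroundJavaScript returns immediately, one way to rule out a pure timing problem might be to re-check the page text in a loop rather than waiting once. The sketch below is only an illustration under that assumption: PollUntil and waitFor are hypothetical names that are not part of HtmlUnit, and the helper uses only the Java standard library. If the content is missing because a script fails inside HtmlUnit, polling will not make it appear.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper (not part of HtmlUnit): polls a condition until it holds or a timeout elapses.
class PollUntil {
    /**
     * Re-checks the condition every intervalMillis until it is true or
     * timeoutMillis has passed. Returns true if the condition held in time,
     * false on timeout or if the waiting thread was interrupted.
     */
    static boolean waitFor(BooleanSupplier condition, long timeoutMillis, long intervalMillis) {
        final long deadline = System.currentTimeMillis() + timeoutMillis;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

In the program above it could be called after getPage as PollUntil.waitFor(() -> page.asText().contains("ACCESO A MIS ANUNCIOS"), javascriptTimeout, 500), where the 500 ms interval is an arbitrary choice. A false result after the full timeout would suggest the section is never rendered at all, rather than rendered late.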