I have a page with the following source:
<!DOCTYPE HTML>
<html>
<head>
<meta charset="UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta name="viewport" content="width=600, initial-scale=1">
<title>Binary Options Platform</title>
<link rel="stylesheet" href="ext/ext-theme-binary-options.css">
<link rel="stylesheet" href="resources/app.css">
<script src="ext/ext-all.js"></script>
<script src="app.js"></script>
<!--[if IE]>
<script src="resources/js/chart/iecanvas.js"></script>
<![endif]-->
<link rel="shortcut icon" href="resources/images/favicon.ico" type="image/x-icon"/>
</head>
<body>
<noscript>You must enable javascript to continue</noscript>
<script type="text/javascript">
if ("WebSocket" in window) {} else {
var ifrm = document.createElement('IFRAME');
ifrm.setAttribute("src", 'http://www.browserupgrade.info/ie6-upgrade/?lang=en&title=www.dukascopy.com&gc=true&more-info-at=http://www.browserupgrade.info');
ifrm.style.width = '100%';
ifrm.style.height = '81px';
ifrm.style.border = 'none';
ifrm.frameBorder = 0;
document.body.appendChild(ifrm);
}
</script>
</body>
</html>
I am trying to fetch its content with HtmlUnit for Java:
try {
    HtmlPage startPage = webClient.getPage("https://demo-login.dukascopy.com/binary/");
    System.out.println("PAGE: \n" + startPage.asText());
} catch (IOException e) {
    e.printStackTrace();
}
The problem is that all I get is:
PAGE:
Binary Options Platform
How can I get the page source of this site in HtmlUnit?
Answer 0 (score: 0)
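Use startPage.asXml() to get the markup of the parsed page, or startPage.getWebResponse().getContentAsString() to get the response body as the server sent it. asText() returns only the text a user would see, which for this page is little more than the title. If both fail, you may need to access the page through webWindows().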
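The two calls mentioned above can be compared side by side. The following is a minimal sketch, not a definitive implementation. It assumes HtmlUnit 2.x on the classpath (package com.gargoylesoftware.htmlunit; newer releases use org.htmlunit instead). The class name PageSourceExample and the helpers domAsXml and rawSource are made up for illustration, and turning off script-error exceptions is an assumption about how this page behaves, not something the question confirms.

```java
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

import java.io.IOException;

public class PageSourceExample {

    /** Hypothetical helper: the DOM as HtmlUnit holds it after parsing, serialized as XML. */
    static String domAsXml(HtmlPage page) {
        return page.asXml();
    }

    /** Hypothetical helper: the response body exactly as the server sent it. */
    static String rawSource(HtmlPage page) {
        return page.getWebResponse().getContentAsString();
    }

    public static void main(String[] args) {
        // WebClient is AutoCloseable, so try-with-resources releases its windows and threads.
        try (WebClient webClient = new WebClient()) {
            // Assumption: the page's scripts may fail inside HtmlUnit; do not abort on that.
            webClient.getOptions().setThrowExceptionOnScriptError(false);

            HtmlPage startPage = webClient.getPage("https://demo-login.dukascopy.com/binary/");

            System.out.println("DOM AS XML:\n" + domAsXml(startPage));
            System.out.println("RAW SOURCE:\n" + rawSource(startPage));
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
```

If the application builds its interface with JavaScript after the page loads, asXml() may show more than the raw source does, since it serializes the current DOM rather than the original response.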