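Hello, I'm new to HtmlUnit. I have a project in which I want to read some information from a web page. Until now, finding elements by name or ID has worked fine, but I can't get the following paragraph element.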
<iframe id="content_ifr" frameborder="0" src='javascript:""' allowtransparency="true" title=".." style="width: 100%; height: 307px; display: block;">
<!DOCTYPE >
<html>
<head> ... </head>
<body id="tinymce" class="mceContentBody content post-type-coupon wp-editor" contenteditable="true" onload="window.parent.tinyMCE.get('content').onLoad.dispatch();" dir="ltr">
<p>------ Text from the Element i want to get ------- </p>
</body>
</html>
</iframe>
I have already tried:
side.getByXPath("//html/body/p"); // zero elements
side.getByXPath("//p"); // 27 elements, but none is the right one
side.getByXPath("//body"); // 1 element, but the wrong one
side.getByXPath("//html"); // 1 element, but the wrong one
side.getByXPath("//html/body/div[3]/div[3]/div[2]/div/div[4]/form/div/div/div/div[2]/div/div[2]/span/table/tbody/tr[2]/td/iframe"); // zero elements found
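I checked all the elements that were found with this code:
List<?> list = gPage.getByXPath("//p");
for (Object x : list) {
    HtmlElement y = (HtmlElement) x;
    // Print every paragraph whose markup or visible text contains the keyword
    if (y.asXml().contains("Keyword") || y.asText().contains("Keyword")) {
        System.out.println(y.asText());
    }
}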
In short, I can't find the paragraph element by its text. Can you help me locate it so that I can read and write its content?
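As a side note, the filtering loop above can also be expressed as a single XPath 1.0 expression with a `contains()` predicate. The sketch below is only a stand-in: it uses the JDK's own `javax.xml.xpath` on a small well-formed snippet instead of HtmlUnit, and the class name `ParagraphByText` and the method `findParagraphs` are made up for illustration. The same expression string should be usable with `getByXPath`, but it would still only match once it is evaluated against the page enclosed in the iframe, not against the outer page.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of HtmlUnit.
final class ParagraphByText {

    // Returns the text of every <p> whose text content contains the keyword.
    static List<String> findParagraphs(String xhtml, String keyword) {
        // Assumption: the keyword has no single quote, so it can be
        // embedded as an XPath string literal.
        if (keyword.indexOf('\'') >= 0) {
            throw new IllegalArgumentException("keyword must not contain a single quote");
        }
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // "." is the string value of the <p>, i.e. all of its descendant text.
            String expression = "//p[contains(., '" + keyword + "')]";
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            List<String> texts = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                texts.add(nodes.item(i).getTextContent());
            }
            return texts;
        } catch (Exception e) {
            // Wrap parser and XPath checked exceptions to keep the sketch short.
            throw new IllegalStateException("could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String sample = "<html><body><p>intro</p><p>the Keyword is here</p></body></html>";
        System.out.println(findParagraphs(sample, "Keyword"));
    }
}
```

The predicate only replaces the manual loop; it does not change which document is searched.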
//Initialize WebClient
final WebClient webClient= new WebClient(BrowserVersion.FIREFOX_24);
webClient.getCookieManager().setCookiesEnabled(true);
webClient.getOptions().setThrowExceptionOnScriptError(false);
webClient.getOptions().setCssEnabled(false);
webClient.getOptions().setUseInsecureSSL(true);
webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
webClient.waitForBackgroundJavaScript(10000);
//Perform a login.
final HtmlPage page = webClient.getPage("");
final HtmlForm form = page.getForms().get(1);
final HtmlTextInput username = form.getInputByName("log");
final HtmlPasswordInput pw = form.getInputByName("pwd");
username.setValueAttribute("");
pw.setValueAttribute("");
@SuppressWarnings("unused")
HtmlPage page2 = (HtmlPage) form.getButtonByName("login").click();
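//Get gutscheinPage (the coupon page)
HtmlPage gutscheinPage = webClient.getPage("");
//Wait again now that the page is loaded: the call only waits for JavaScript jobs that are already scheduled
webClient.waitForBackgroundJavaScript(10000);
//Change the content of the text field: the editor lives in the first frame, which encloses its own page
HtmlPage pageFrame = (HtmlPage) gutscheinPage.getFrames().get(0).getEnclosedPage();
HtmlElement body = pageFrame.getBody();
HtmlParagraph p = (HtmlParagraph) body.getByXPath("//p").get(0);
p.setTextContent(text);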
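Solved: I changed the WebClient's default browser version and waited for the JavaScript, used getFrames to reach the page inside the iframe, took its body, and a simple XPath now gives me my paragraph element.
I really hope someone finds this helpful for their own work.
Thanks for every answer.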
Answer 0 (score: 2)
As you can see, the element is inside an iframe. I think you need to switch into the frame first.
Here is the documentation you should look at.
// untested Java code, please debug and read the documentation yourself
final List<FrameWindow> window = page.getFrames();
final HtmlPage pageTwo = (HtmlPage) window.get(0).getEnclosedPage();
// then find TinyMCE's body, which should be treated as a separate HTML page
final HtmlElement body = pageTwo.getBody();