I'm trying to get the values of the "a" and "span" elements using HtmlCleaner.
<div class="info">
<p class="name">
<a href="http://www.zxdv.com/level/1/film/616/sr/1/">Tron</a>
<span class="year">2001</span>
</p>
</div>
Here is the code:
TagNode linkElements[] = rootNode.getElementsByName("div", true);
int s = 0;
for (int i = 0; linkElements != null && i < linkElements.length; i++)
{
    if (linkElements[i].getAttributes().toString().equals("{class=info}")) {
        TagNode linkElements2[] = linkElements[i].getElementsByName("p", true);
        for (int i2 = 0; linkElements2 != null && i2 < linkElements2.length; i2++)
        {
            TagNode linkElements3[] = linkElements2[i2].getElementsByName("a", true);
            TagNode linkElements4[] = linkElements2[i2].getElementsByName("span", true);
            for (int i3 = 0; linkElements3 != null && i3 < linkElements3.length; i3++)
            {
                if (s <= 20) {
                    String currentlink = linkElements3[i3].getText().toString();
                    String currentlink2 = linkElements4[i3].getText().toString();
                    slink[s] = currentlink + "\n" + currentlink2;
                    s++;
                }
            }
        }
    }
}
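As I understand it, I first get the "div" element and then its child "p" element, but when I take the "a" and "span" elements, their values come back empty. Please tell me where I went wrong. Thanks.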
Answer 0 (score: 4)
Use XPath instead; it takes much less work.
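// Assumes `htmlCleaner` is an HtmlCleaner instance and `url` is a java.net.URL.
// clean(...) can throw IOException; evaluateXPath(...) can throw XPatherException.
TagNode root = htmlCleaner.clean(url);

// XPath to the 'a' element
Object[] foundList = root.evaluateXPath("//div/p[@class='name']/a");
if (foundList == null || foundList.length < 1) {
    return;
}
TagNode aNode = (TagNode) foundList[0];
// getText() does not return a String, so convert it explicitly
String aNodeTextContent = aNode.getText().toString();

// XPath to the 'span' element
foundList = root.evaluateXPath("//div/p[@class='name']/span");
if (foundList == null || foundList.length < 1) {
    return;
}
TagNode spanNode = (TagNode) foundList[0];
String spanNodeTextContent = spanNode.getText().toString();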
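
If you want to check the two XPath expressions without HtmlCleaner on the classpath, here is a minimal sketch that runs them with the JDK's built-in javax.xml.xpath instead. Note that this swaps in a different library: it only works because the snippet from the question happens to be well-formed XML, whereas real-world HTML usually is not and needs a cleaner such as HtmlCleaner first. The class name XPathSketch, the SAMPLE constant, and the extract method are made up for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of HtmlCleaner.
final class XPathSketch {

    // The markup from the question; it is well-formed, so an XML parser accepts it.
    static final String SAMPLE =
        "<div class=\"info\">"
      + "<p class=\"name\">"
      + "<a href=\"http://www.zxdv.com/level/1/film/616/sr/1/\">Tron</a>"
      + "<span class=\"year\">2001</span>"
      + "</p>"
      + "</div>";

    // Returns { text of the first matching <a>, text of the first matching <span> }.
    static String[] extract(String markup) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(markup)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // evaluate(String, Object) yields the string value of the first match,
        // or an empty string when nothing matches.
        String title = xpath.evaluate("//div/p[@class='name']/a", doc);
        String year = xpath.evaluate("//div/p[@class='name']/span", doc);
        return new String[] { title.trim(), year.trim() };
    }

    public static void main(String[] args) throws Exception {
        String[] result = extract(SAMPLE);
        System.out.println(result[0]);
        System.out.println(result[1]);
    }
}
```

Against the snippet from the question this prints Tron and then 2001. If the same expressions return nothing in HtmlCleaner, the cleaned tree probably differs from the markup you expect, so it is worth dumping the cleaned document and checking the actual nesting.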