I am trying to parse an HTML string with the htmlparser library. The HTML looks like this:
<body>
<div class="Level1">
<div class="row">
<div class="txt">
Date of analysis:
</div><div class="content">
02/03/11
</div>
</div>
</div><div class="Level1">
<div class="row">
<div class="txt">
Site:
</div><div class="content">
13.0E
</div>
</div>
</div><div class="Level1">
<div class="row">
<div class="txt">
Network type:
</div><div class="content">
DVB-S
</div>
</div>
</div>
</body>
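I need to extract the "content" value for a given "txt". I wrote a filter that returns the divs with class="Level1", but I don't know how to filter on what a div contains. In other words, if the value of txt is Site:, I want to read the matching content, e.g. 13.0E.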
NodeList nl = parser.extractAllNodesThatMatch(new AndFilter(new TagNameFilter("div"), new HasAttributeFilter("class", "Level1")));
Can anyone help me with this? How do I read a div inside a div? Thanks!
Answer 0 (score: 0)
Instead of:
NodeList nl = parser.extractAllNodesThatMatch(new AndFilter(new TagNameFilter("div"), new HasAttributeFilter("class", "Level1")));
it is better to do this:
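import org.htmlparser.Node;
import org.htmlparser.filters.AndFilter;
import org.htmlparser.filters.HasAttributeFilter;
import org.htmlparser.filters.TagNameFilter;
import org.htmlparser.util.NodeList;

NodeList nl = parser.parse(null); // a null filter returns all top-level nodes; you can also filter here
// Pass true to search recursively, because the "txt" divs are nested inside other divs.
NodeList divs = nl.extractAllNodesThatMatch(
        new AndFilter(new TagNameFilter("DIV"),
                new HasAttributeFilter("class", "txt")), true);
if (divs.size() > 0) {
    Node div = divs.elementAt(0); // elementAt() returns a Node, not a Tag
    // toPlainTextString() is the text inside the div; getText() would return the tag's own markup
    String text = div.toPlainTextString().trim();
}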
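
The lookup the question asks for (find the "txt" div by its label, then read the "content" div next to it) can also be sketched without htmlparser. The example below is not the htmlparser API: it swaps in the JDK's built-in XML parser and XPath, which only works because the sample markup happens to be well-formed XML. The class name ContentLookup and the method contentFor are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of htmlparser.
class ContentLookup {

    /**
     * Returns the trimmed text of the first "content" div that follows the
     * "txt" div whose text equals label. Returns null when no such pair
     * exists (or when the content div is empty).
     */
    static String contentFor(String markup, String label) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(markup)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            // Bind the label as an XPath variable so quotes in it cannot break the expression.
            xpath.setXPathVariableResolver(
                    name -> "label".equals(name.getLocalPart()) ? label : null);

            String expr =
                    "normalize-space(//div[@class='txt'][normalize-space(.)=$label]"
                    + "/following-sibling::div[@class='content'][1])";
            String value = xpath.evaluate(expr, doc);
            return value.isEmpty() ? null : value;
        } catch (ParserConfigurationException | SAXException
                | IOException | XPathExpressionException e) {
            throw new IllegalArgumentException("markup could not be parsed or queried", e);
        }
    }

    public static void main(String[] args) {
        // Same structure as the HTML in the question, shortened to two rows.
        String markup =
                "<body>"
                + "<div class=\"Level1\"><div class=\"row\">"
                + "<div class=\"txt\">\nDate of analysis:\n</div>"
                + "<div class=\"content\">\n02/03/11\n</div>"
                + "</div></div>"
                + "<div class=\"Level1\"><div class=\"row\">"
                + "<div class=\"txt\">\nSite:\n</div>"
                + "<div class=\"content\">\n13.0E\n</div>"
                + "</div></div>"
                + "</body>";
        System.out.println(contentFor(markup, "Site:")); // prints 13.0E
    }
}
```

With real-world HTML that is not well-formed, this approach fails and a tolerant parser such as htmlparser is still needed. There the same pairing could probably be done by taking the next sibling of the matched "txt" node, but that is an untested assumption.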