<p class="name">
<a href="/shop/view.php?index_no=22176&cate="><strong class="title displaynone"> :</strong>T-shritsT</a> <span class="icon"></span></p>
<ul class="xans-element- xans-product xans-product-listitem">
<li class=" xans-record-"><strong class="title displaynone"><span style="font-size:12px;color:#555555;">price</span> :</strong> <span style="font-size:12px;color:#555555;"><s></s>$20</span></li>
In this markup, I want to extract only the text "T-shrits" and the price "$20", without the ":" and the "price" label.
This is my code:
Elements goods = document.select("p.name > a");
for (Element e : goods) {
    System.out.println("------------------------------------------");
    System.out.println("goods" + e.text());
}
Answer 0 (score: 0)
Try this:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.safety.Whitelist;

public class Test {
    public static void main(String[] args) {
        String s = "<p class=\"name\">\n" +
                "<a href=\"/shop/view.php?index_no=22176&cate=\"><strong class=\"title displaynone\"> :</strong>T-shritsT</a> <span class=\"icon\"></span></p>\n" +
                "<ul class=\"xans-element- xans-product xans-product-listitem\">\n" +
                "<li class=\" xans-record-\"><strong class=\"title displaynone\"><span style=\"font-size:12px;color:#555555;\">price</span> :</strong> <span style=\"font-size:12px;color:#555555;\"><s></s>$20</span></li>";
        Document document = Jsoup.parse(s);
        // Drop every <strong> element, which removes both the ":" and the "price" label.
        document.select("strong").remove();
        // Note: newer jsoup releases name this class Safelist instead of Whitelist.
        Whitelist whitelist = Whitelist.basic();
        // Clean the remaining markup, re-parse it, and print only the text.
        System.out.println(Jsoup.parse(Jsoup.clean(document.toString(), whitelist)).text());
    }
}
Output:
T-shritsT $20
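If the two values are needed separately rather than as one combined string, a possible alternative is to read them straight from the elements instead of removing nodes first. The sketch below is only an illustration, not part of the answer above: the class name `OwnTextExample` and the helper methods `extractName` and `extractPrice` are made up for this example, and the `li > span` selector assumes the page has the same shape as the snippet in the question. It relies on jsoup's `Element.ownText()`, which returns the text that belongs directly to an element and skips the text of child elements such as `<strong>`.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class OwnTextExample {
    // Sample markup copied from the question.
    public static final String SAMPLE_HTML =
            "<p class=\"name\">\n" +
            "<a href=\"/shop/view.php?index_no=22176&cate=\"><strong class=\"title displaynone\"> :</strong>T-shritsT</a> <span class=\"icon\"></span></p>\n" +
            "<ul class=\"xans-element- xans-product xans-product-listitem\">\n" +
            "<li class=\" xans-record-\"><strong class=\"title displaynone\"><span style=\"font-size:12px;color:#555555;\">price</span> :</strong> <span style=\"font-size:12px;color:#555555;\"><s></s>$20</span></li>";

    // Hypothetical helper: text of the product link without the hidden <strong> child.
    public static String extractName(String html) {
        Document doc = Jsoup.parse(html);
        Element link = doc.selectFirst("p.name > a");
        // ownText() ignores text inside child elements, so the ":" is skipped.
        return link == null ? "" : link.ownText().trim();
    }

    // Hypothetical helper: text of the <span> that is a direct child of the <li>.
    public static String extractPrice(String html) {
        Document doc = Jsoup.parse(html);
        // The "price" label sits in a <span> nested inside <strong>,
        // so the child combinator ">" does not match it.
        Element price = doc.selectFirst("li > span");
        return price == null ? "" : price.text().trim();
    }

    public static void main(String[] args) {
        System.out.println(extractName(SAMPLE_HTML));
        System.out.println(extractPrice(SAMPLE_HTML));
    }
}
```

With the markup from the question this prints `T-shritsT` and `$20` on separate lines. On a real product list the selectors would probably need to be scoped to each item, for example by iterating over the item containers and calling `selectFirst` on each one.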