I am trying to get a piece of HTML, for example:
<tr class="myclass-1234" rel="5678">
<td class="lst top">foo 1</td>
<td class="lst top">foo 2</td>
<td class="lst top">foo-5678</td>
<td class="lst top nw" style="text-align:right;">
<span class="nw">1.00</span> foo
</td>
<td class="top">01.05.2015</td>
</tr>
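I am completely new to jsoup. My first thought was to select the row by its class name, but the number 1234 is generated dynamically. Is there a way to select it by part of the class name, or is there a better approach?
Answer 0 (score: 0)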
doc.select("tr[class~=myclass.*]");
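This selects any tr element whose class attribute matches the regular expression myclass.*. Because the pattern is not anchored, it can also match when myclass appears later in the attribute value; use tr[class~=^myclass] if the attribute must start with myclass.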
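To illustrate why an unanchored pattern behaves like "contains" rather than "starts with", here is a minimal sketch that uses only java.util.regex. It assumes that jsoup's [attr~=regex] selector looks for the pattern anywhere in the attribute value, which is my understanding of the library but is not confirmed by this thread. The class RegexAttrDemo and the helper findsIn are hypothetical names used for illustration only.

```java
import java.util.regex.Pattern;

public class RegexAttrDemo {
    // Hypothetical helper (not part of jsoup): reports whether the regex
    // is found anywhere inside the given attribute value.
    public static boolean findsIn(String regex, String attrValue) {
        return Pattern.compile(regex).matcher(attrValue).find();
    }

    public static void main(String[] args) {
        // Unanchored pattern: matches wherever "myclass" occurs.
        System.out.println(findsIn("myclass.*", "myclass-1234"));      // true
        System.out.println(findsIn("myclass.*", "top myclass-1234"));  // true
        System.out.println(findsIn("myclass.*", "notmyclass"));        // true

        // Anchored pattern: the whole value must be "myclass-" plus digits.
        System.out.println(findsIn("^myclass-\\d+$", "myclass-1234")); // true
        System.out.println(findsIn("^myclass-\\d+$", "notmyclass"));   // false
    }
}
```

If the assumption holds, a selector such as tr[class~=^myclass-\d+$] would be the stricter equivalent inside doc.select.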
Answer 1 (score: 0)
Assuming a simple HTML document that contains two tr elements, only one of which has the class you mentioned, the code below shows how to get that tr with a CSS selector.
The selector tr[class^=myclass] selects all elements of type "tr" whose class attribute starts with (^) myclass:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class Example {
    public static void main(String[] args) {
        String html = "<html><body><table><tr class=\"myclass-1234\" rel=\"5678\">"
                + "<td class=\"lst top\">foo 1</td>"
                + "<td class=\"lst top\">foo 2</td>"
                + "<td class=\"lst top\">foo-5678</td>"
                + "<td class=\"lst top nw\" style=\"text-align:right;\">"
                + "<span class=\"nw\">1.00</span> foo"
                + "</td>"
                + "<td class=\"top\">01.05.2015</td>"
                + "</tr><tr><td>Not to be selected</td></tr></table></body></html>";
        Document doc = Jsoup.parse(html);

        Elements selectAllTr = doc.select("tr");
        // Should be 2
        System.out.println("tr elements in html: " + selectAllTr.size());

        Elements trWithStartingClassMyClass = doc.select("tr[class^=myclass]");
        // Should be 1
        System.out.println("tr elements with class \"myclass*\" in html: " + trWithStartingClassMyClass.size());
        System.out.println(trWithStartingClassMyClass);
    }
}
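One caveat worth sketching: a prefix test such as [class^=myclass] is applied to the whole attribute string, so a row declared as class="top myclass-1234" would not match. The following stdlib-only sketch shows the difference between testing the whole value and testing each whitespace-separated class token. ClassPrefixDemo and hasClassWithPrefix are hypothetical names, not jsoup API; with jsoup you could presumably apply the same per-token check to the set returned by Element.classNames().

```java
import java.util.Arrays;

public class ClassPrefixDemo {
    // Hypothetical helper: true if any whitespace-separated class token
    // in the attribute value starts with the given prefix.
    public static boolean hasClassWithPrefix(String classAttr, String prefix) {
        return Arrays.stream(classAttr.trim().split("\\s+"))
                .anyMatch(token -> token.startsWith(prefix));
    }

    public static void main(String[] args) {
        String single = "myclass-1234";
        String multi = "top myclass-1234";

        // Whole-value prefix test, analogous to [class^=myclass-].
        System.out.println(single.startsWith("myclass-")); // true
        System.out.println(multi.startsWith("myclass-"));  // false

        // Per-token test also finds the class when it is not listed first.
        System.out.println(hasClassWithPrefix(multi, "myclass-")); // true
    }
}
```

For the markup in the question the class attribute holds a single token, so the simpler prefix selector is sufficient; the per-token check only matters if other classes may precede the dynamic one.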