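I have a requirement: using jsoup, I need to fetch a tr from a document by passing a specific th and its corresponding td value. I can get the td and the th separately with contains(), but is there a way to test whether a particular td value actually belongs to that th? For example, I have the following HTML: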
<tr>
<th>id</th>
<th>name</th>
</tr>
<tr>
<td>11</td>
<td>ABC</td>
</tr>
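Here I need to find the tr based on two parameters, the th and the td. If I pass name and ABC, it should give me the complete tr. If I pass a pair that does not match, e.g. name and DEF, it should not return a tr, because the name column has no DEF value.
Answer 0 (score: 1)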
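You want to compare sibling indexes: Node#siblingIndex(), or Element#elementSiblingIndex() when the markup has whitespace between the cells (the former counts text nodes too).
First, we determine the sibling index of the th element matching "name".
The th is found with the CSS selector below: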
th:containsOwn(thValue)
Then we look for the tr element that has a td element with the same sibling index and containing the value "ABC". This uses the following CSS selector:
table tr:has(td:containsOwn(tdValue):eq(thSiblingIndex))
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
/**
 * Returns the first tr element that meets all of the following requirements:
 * - The document has a th element whose own text contains thValue
 * - The tr has a td element whose own text contains tdValue
 * - The matching th and td elements must have the same element sibling index.
 *
 * @param doc The document to search in
 * @param thValue The value in the th element
 * @param tdValue The value in the td element
 * @return The first matching tr element.
 * @throws RuntimeException if no tr element can be found.
 */
public static Element findFirstTR(Document doc, String thValue, String tdValue) {
    Element th = doc.select("th:containsOwn(" + thValue + ")").first();
    if (th == null) {
        throw new RuntimeException("Unable to find th element containing: " + thValue);
    }

    // :eq(n) counts element siblings only, so use elementSiblingIndex();
    // siblingIndex() also counts text nodes such as whitespace between cells.
    int column = th.elementSiblingIndex();
    Element tr = th.parents().select("table tr:has(td:containsOwn(" + tdValue + "):eq(" + column + "))").first();
    if (tr == null) {
        throw new RuntimeException("Unable to find tr element matching: thValue=" + thValue + " and tdValue=" + tdValue);
    }
    return tr;
}
String html = "<table><tr><th>id</th><th>name</th></tr><tr><td>11</td><td>ABC</td></tr></table>";
Document doc = Jsoup.parse(html);
Element tr = findFirstTR(doc, "name", "ABC");
System.out.println(tr.outerHtml());
<tr>
<td>11</td>
<td>ABC</td>
</tr>
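For readers who want to see the two-step lookup without any selector syntax, here is a minimal sketch of the same logic on plain lists. It is only a model: `ColumnMatchSketch`, `findFirstRow`, and the list-based table are hypothetical names of mine, not jsoup API, and the matching is case-sensitive whereas jsoup's containsOwn ignores case.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for a parsed table: one header row plus data rows.
final class ColumnMatchSketch {

    // Returns the first row whose cell in the column headed by thValue contains tdValue.
    static Optional<List<String>> findFirstRow(List<String> headers,
                                               List<List<String>> rows,
                                               String thValue,
                                               String tdValue) {
        // Step 1: position of the first header cell containing thValue.
        int column = -1;
        for (int i = 0; i < headers.size(); i++) {
            if (headers.get(i).contains(thValue)) {
                column = i;
                break;
            }
        }
        if (column < 0) {
            return Optional.empty(); // no such header
        }

        // Step 2: first row whose cell at that same position contains tdValue.
        for (List<String> row : rows) {
            if (column < row.size() && row.get(column).contains(tdValue)) {
                return Optional.of(row);
            }
        }
        return Optional.empty(); // header exists, but no row has the value in that column
    }

    public static void main(String[] args) {
        List<String> headers = List.of("id", "name");
        List<List<String>> rows = List.of(List.of("11", "ABC"));
        System.out.println(findFirstRow(headers, rows, "name", "ABC")); // Optional[[11, ABC]]
        System.out.println(findFirstRow(headers, rows, "name", "DEF")); // Optional.empty
    }
}
```

The point of the model is that the value is only checked in the column found in step 1, so a pair such as id and ABC does not match even though ABC appears elsewhere in the row.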
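Answer 1 (score: 1)
After some experimenting I came up with this; it finds the element in the table header (th) with the specified text and, if it exists and the other specified text is present at the matching position in the table, returns the entire table row.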
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

private Elements fetchCompleteTr(Document doc, String tableHeaderName, String tableValue) {
    Elements tableHeaders = doc.select("th:containsOwn(" + tableHeaderName + ")"); // find the table header
    if (tableHeaders.isEmpty()) {
        return null; // the header was not found in the table
    }
    int thElementIndex = tableHeaders.first().elementSiblingIndex();
    Elements tableRows = doc.select("tr:has(td:eq(" + thElementIndex + "):containsOwn(" + tableValue + "))");
    if (tableRows.isEmpty()) {
        return null; // the value for the specified table header does not exist
    } else {
        return tableRows;
    }
}
Here is a test and a short demonstration of how to use it:
System.out.println("With fetchCompleteTr(doc, \"name\", \"ABC\"):");
System.out.println(fetchCompleteTr(doc, "name", "ABC"));
System.out.println("With fetchCompleteTr(doc, \"name\", \"XYZ\"):");
System.out.println(fetchCompleteTr(doc, "name", "XYZ"));
System.out.println("With fetchCompleteTr(doc, \"id\", \"11\"):");
System.out.println(fetchCompleteTr(doc, "id", "11") );
Which prints:
With fetchCompleteTr(doc, "name", "ABC"):
<tr>
<td>11</td>
<td>ABC</td>
</tr>
With fetchCompleteTr(doc, "name", "XYZ"):
null (because no "name" with "XYZ" in the table exists)
With fetchCompleteTr(doc, "id", "11"):
<tr>
<td>11</td>
<td>ABC</td>
</tr>
If you want to use it with multiple tables, you can modify it as follows:
private Elements fetchCompleteTr(Element table, String tableHeaderName, String tableValue) {
    Elements tableHeaders = table.select("th:containsOwn(" + tableHeaderName + ")"); // find the table header
    if (tableHeaders.isEmpty()) {
        return null; // the header was not found in the table
    }
    int thElementIndex = tableHeaders.first().elementSiblingIndex();
    Elements tableRows = table.select("tr:has(td:eq(" + thElementIndex + "):containsOwn(" + tableValue + "))");
    if (tableRows.isEmpty()) {
        return null; // the value for the specified table header does not exist
    } else {
        return tableRows;
    }
}
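Then use it like this:
for (Element e : myDocument.select("table")) {
    Elements rows = fetchCompleteTr(e, "name", "XYZ"); // null when this table has no match
}
This way you can search all the tables in the document.
Note that I have not tested this extensively, so it may contain bugs. It also looks rather complicated, but I could not think of anything shorter or better.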
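One thing that may be worth guarding against: the snippets in this thread paste the search text straight into the selector, so a value containing a parenthesis could end the containsOwn(...) argument early. As far as I know, jsoup's selector syntax lets you escape parentheses in the search text with a backslash; escaping the backslash itself the same way is an assumption on my part. A rough sketch of such a helper (`SelectorArgSketch`, `escapeContainsArg`, and `headerSelector` are hypothetical names, not part of jsoup):

```java
// Hypothetical helper for building :containsOwn(...) selectors from arbitrary text.
final class SelectorArgSketch {

    // Puts a backslash in front of '(', ')' and '\' so they are read as literal text.
    static String escapeContainsArg(String value) {
        StringBuilder out = new StringBuilder(value.length() + 8);
        for (char c : value.toCharArray()) {
            if (c == '(' || c == ')' || c == '\\') {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    // Builds the header selector used in the answers, with the text escaped.
    static String headerSelector(String headerText) {
        return "th:containsOwn(" + escapeContainsArg(headerText) + ")";
    }

    public static void main(String[] args) {
        System.out.println(headerSelector("price (USD)")); // th:containsOwn(price \(USD\))
    }
}
```

The result would then be passed to select() in place of the hand-concatenated string; it is worth checking against the jsoup version you use before relying on it.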