Here are the rows extracted from the table:
<table class="infobox vevent" style="width:22em">
<caption class="summary">Adobe Shockwave Player</caption>
<tr>
<td colspan="2" style="text-align:center"><a href="/wiki/File:Adobe_Shockwave_Player_logo.png" class="image"><img alt="Adobe Shockwave Player logo.png" src="//upload.wikimedia.org/wikipedia/en/thumb/8/8e/Adobe_Shockwave_Player_logo.png/64px-Adobe_Shockwave_Player_logo.png" width="64" height="64" srcset="//upload.wikimedia.org/wikipedia/en/thumb/8/8e/Adobe_Shockwave_Player_logo.png/96px-Adobe_Shockwave_Player_logo.png 1.5x, //upload.wikimedia.org/wikipedia/en/thumb/8/8e/Adobe_Shockwave_Player_logo.png/128px-Adobe_Shockwave_Player_logo.png 2x" data-file-width="165" data-file-height="165"></a></td>
</tr>
<tr>
<th scope="row" style="white-space: nowrap;"><a href="/wiki/Software_developer" title="Software developer">Original author(s)</a></th>
<td><a href="/wiki/Macromedia" title="Macromedia">Macromedia</a></td>
</tr>
<tr>
<th scope="row" style="white-space: nowrap;"><a href="/wiki/Software_developer" title="Software developer">Developer(s)</a></th>
<td><a href="/wiki/Adobe_Systems" title="Adobe Systems">Adobe Systems</a></td>
</tr>
<tr>
<th scope="row" style="white-space: nowrap;"><a href="/wiki/Software_release_life_cycle" title="Software release life cycle">Stable release</a></th>
<td>12.2.4.194 / 19 February 2016<span class="noprint">; 4 months ago</span><span style="display:none"> (<span class="bday dtstart published updated">2016-02-19</span>)</span><sup id="cite_ref-1" class="reference"><a href="#cite_note-1">[1]</a></sup></td>
</tr>
<tr>
<th scope="row" style="white-space: nowrap;"><a href="/wiki/Operating_system" title="Operating system">Operating system</a></th>
<td><a href="/wiki/Microsoft_Windows" title="Microsoft Windows">Microsoft Windows</a>, <a href="/wiki/Mac_OS_9" title="Mac OS 9">Mac OS 9</a>, <a href="/wiki/Mac_OS_X" class="mw-redirect" title="Mac OS X">Mac OS X</a> (Universal)</td>
</tr>
<tr>
<th scope="row" style="white-space: nowrap;"><a href="/wiki/Computing_platform" title="Computing platform">Platform</a></th>
<td><a href="/wiki/Web_browsers" class="mw-redirect" title="Web browsers">Web browsers</a></td>
</tr>
<tr>
<th scope="row" style="white-space: nowrap;"><a href="/wiki/List_of_software_categories" title="List of software categories">Type</a></th>
<td>Multimedia Player / <a href="/wiki/MIME" title="MIME">MIME</a> type: application/x-director</td>
</tr>
<tr>
<th scope="row" style="white-space: nowrap;"><a href="/wiki/Software_license" title="Software license">License</a></th>
<td><a href="/wiki/Proprietary_software" title="Proprietary software">Proprietary</a><sup id="cite_ref-2" class="reference"><a href="#cite_note-2">[2]</a></sup></td>
</tr>
<tr>
<th scope="row" style="white-space: nowrap;">Website</th>
<td><span class="url"><a rel="nofollow" class="external text" href="http://www.adobe.com/products/shockwaveplayer/">www<wbr>.adobe<wbr>.com<wbr>/products<wbr>/shockwaveplayer<wbr>/</a></span></td>
</tr>
</table>
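I want to get:
1. The td text "12.2.4.194" under the header text "Stable release".
2. The td text "Microsoft Windows" under the header text "Operating system".
I am stuck with the following code: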
Document doc = Jsoup.connect("url").get();
for (Element table : doc.select("table.infobox")) {
    String strName = table.getElementsByTag("caption").text();
    if (strName.toLowerCase().contains("shockwave player")) {
        Elements trow = table.select("tr");
        System.out.println(trow);
    }
}
Answer 0 (score: 1)
Try this CSS query:
table.infobox tr:has(a:containsOwn(Stable release)) > td,
table.infobox tr:has(a:containsOwn(Microsoft Windows)) > td
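To see what the query selects, here is a minimal sketch that runs it against an inline copy of the infobox. The class name `InfoboxQueryDemo`, the method `selectCells`, and the shortened `HTML` string are made up for illustration; only two rows of the original table are kept. Note that the second selector finds its row through the "Microsoft Windows" link inside the cell, not through the "Operating system" header.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

class InfoboxQueryDemo {
    // Shortened copy of the infobox from the question (assumption: other rows omitted).
    static final String HTML =
        "<table class=\"infobox vevent\">"
      + "<caption class=\"summary\">Adobe Shockwave Player</caption>"
      + "<tr><th scope=\"row\"><a href=\"/wiki/Software_release_life_cycle\">Stable release</a></th>"
      + "<td>12.2.4.194 / 19 February 2016<span class=\"noprint\">; 4 months ago</span></td></tr>"
      + "<tr><th scope=\"row\"><a href=\"/wiki/Operating_system\">Operating system</a></th>"
      + "<td><a href=\"/wiki/Microsoft_Windows\">Microsoft Windows</a>, "
      + "<a href=\"/wiki/Mac_OS_9\">Mac OS 9</a></td></tr>"
      + "</table>";

    // Runs the combined CSS query from the answer and returns the matching td elements.
    static Elements selectCells(String html) {
        String query =
            "table.infobox tr:has(a:containsOwn(Stable release)) > td, "
          + "table.infobox tr:has(a:containsOwn(Microsoft Windows)) > td";
        return Jsoup.parse(html).select(query);
    }

    public static void main(String[] args) {
        // Print the full text of every matched cell, one per line.
        for (Element td : selectCells(HTML)) {
            System.out.println(td.text());
        }
    }
}
```

With a live page you would replace `Jsoup.parse(html)` with the `Jsoup.connect("url").get()` call from the question.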
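// Returns the own text of the first td in the row whose anchor text contains headerText.
public static String getTDtext(Element table, String headerText) {
    Element td = table.select("tr:has(a:containsOwn(" + headerText + ")) > td").first();
    if (td == null) {
        throw new RuntimeException("Unable to find text for " + headerText);
    }
    // ownText() returns only the td's direct text nodes, not the text of child elements.
    return td.ownText();
}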
tr                           /* Select tr elements ... */
:has(                        /* ... having ... */
  a                          /* ... a descendant anchor element ... */
  :containsOwn(headerText)   /* ... whose own text contains headerText ... */
)
> td                         /* ... then select the td elements that are direct children of that tr */
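One caveat: `ownText()` on the "Stable release" cell gives "12.2.4.194 / 19 February 2016", and on the "Operating system" cell it gives only the separators, because the names sit inside anchor elements. If you need exactly "12.2.4.194" and "Microsoft Windows", something like the sketch below might do. The names `InfoboxValues`, `cellFor`, `stableVersion` and `firstOperatingSystem` are hypothetical, and the code assumes the version always comes before the first "/" and that the first link in the cell is the operating system you want.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;

class InfoboxValues {
    // Shortened copy of two infobox rows from the question (assumption: other rows omitted).
    static final String SAMPLE =
        "<table class=\"infobox vevent\">"
      + "<tr><th scope=\"row\"><a href=\"/wiki/Software_release_life_cycle\">Stable release</a></th>"
      + "<td>12.2.4.194 / 19 February 2016<span class=\"noprint\">; 4 months ago</span></td></tr>"
      + "<tr><th scope=\"row\"><a href=\"/wiki/Operating_system\">Operating system</a></th>"
      + "<td><a href=\"/wiki/Microsoft_Windows\">Microsoft Windows</a>, "
      + "<a href=\"/wiki/Mac_OS_9\">Mac OS 9</a></td></tr>"
      + "</table>";

    // Hypothetical helper: first td of the row whose th text contains headerText, or null.
    static Element cellFor(Element table, String headerText) {
        return table.select("tr:has(th:contains(" + headerText + ")) > td").first();
    }

    // Keeps only the part of the cell's own text before the first "/".
    static String stableVersion(Element table) {
        Element td = cellFor(table, "Stable release");
        if (td == null) return null;
        return td.ownText().split("/")[0].trim();
    }

    // Returns the text of the first link in the cell, falling back to the whole cell text.
    static String firstOperatingSystem(Element table) {
        Element td = cellFor(table, "Operating system");
        if (td == null) return null;
        Element link = td.select("a").first();
        return link != null ? link.text() : td.text();
    }

    public static void main(String[] args) {
        Element table = Jsoup.parse(SAMPLE).select("table.infobox").first();
        System.out.println(stableVersion(table));
        System.out.println(firstOperatingSystem(table));
    }
}
```

Matching on the `th` text rather than on an anchor also covers rows such as "Website", whose header is plain text without a link.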