I have the following HTML:
<tr>
<td colspan="6" class="sumheadtop"> Friday 18 March 2016</td>
</tr>
<tr>
<td colspan="6" class="sumheadbot"> PASSENGER ARRIVALS | DOMESTIC & INTERNATIONAL | All Airlines | ALL OriginS</td>
</tr>
<tr class="schedulerow" style="height:2px"><td colspan="6"></td></tr>
<tr class="schedulerow" valign="top">
<td class="airline"><img src="/webfids/images/3u.gif" width="100" height="24" vspace="0" alt="Sichuan Airlines"/></td>
<td class="flight" nowrap>3U 8989</td>
<td class="city">Chengdu</td>
<td class="time">19:00</td>
<td class="estimated">20:01</td>
<td class="status"><div class="statusone">LANDED</div></td>
</tr>
<tr class="schedulerow" style="height:2px"><td colspan="6"></td></tr>
<tr class="schedulerowtwo" style="height:2px"><td colspan="6"></td></tr>
<tr class="schedulerowtwo" valign="top">
<td class="airline"><img src="/webfids/images/q2.gif" width="100" height="24" vspace="0" alt="Maldivian"/></td>
<td class="flight" nowrap>Q2 107</td>
<td class="city">Gan</td>
<td class="time">19:35</td>
<td class="estimated">19:30</td>
<td class="status"><div class="statusone">LANDED</div></td>
</tr>
<tr>
<td colspan="6" class="sumheadtop"> Saturday 19 March 2016</td>
</tr>
<tr>
<td colspan="6" class="sumheadbot"> PASSENGER ARRIVALS | DOMESTIC & INTERNATIONAL | All Airlines | ALL OriginS</td>
</tr>
<tr class="schedulerow" style="height:2px"><td colspan="6"></td></tr>
<tr class="schedulerow" valign="top">
<td class="airline"><img src="/webfids/images/3u.gif" width="100" height="24" vspace="0" alt="Sichuan Airlines"/></td>
<td class="flight" nowrap>3U 8989</td>
<td class="city">Chengdu</td>
<td class="time">19:00</td>
<td class="estimated">20:01</td>
<td class="status"><div class="statusone">LANDED</div></td>
</tr>
<tr class="schedulerow" style="height:2px"><td colspan="6"></td></tr>
<tr class="schedulerowtwo" style="height:2px"><td colspan="6"></td></tr>
<tr class="schedulerowtwo" valign="top">
<td class="airline"><img src="/webfids/images/q2.gif" width="100" height="24" vspace="0" alt="Maldivian"/></td>
<td class="flight" nowrap>Q2 107</td>
<td class="city">Gan</td>
<td class="time">19:35</td>
<td class="estimated">19:30</td>
<td class="status"><div class="statusone">LANDED</div></td>
</tr>
I want to get the rows between two cells with the "sumheadtop" class.
How can I achieve this using Jsoup?
I tried the code below, but it gives me all the rows that follow the first "sumheadtop" cell:
Document doc = Jsoup.parse(html);
Elements date = doc.select("td[class=sumheadtop]");
Elements siblings = date.first().parent().siblingElements();
Answer 0 (score: 0)
Try this:
String html = "<table><tr>\n <td colspan=\"6\" class=\"sumheadtop\"> Friday 18 March 2016</td>\n</tr>\n<tr>\n <td colspan=\"6\" class=\"sumheadbot\"> PASSENGER ARRIVALS | DOMESTIC & INTERNATIONAL | All Airlines | ALL OriginS</td>\n</tr>\n<tr class=\"schedulerow\" style=\"height:2px\"><td colspan=\"6\"></td></tr>\n<tr class=\"schedulerow\" valign=\"top\">\n <td class=\"airline\"><img src=\"/webfids/images/3u.gif\" width=\"100\" height=\"24\" vspace=\"0\" alt=\"Sichuan Airlines\"/></td>\n <td class=\"flight\" nowrap>3U 8989</td>\n <td class=\"city\">Chengdu</td>\n <td class=\"time\">19:00</td>\n <td class=\"estimated\">20:01</td>\n <td class=\"status\"><div class=\"statusone\">LANDED</div></td>\n</tr>\n<tr class=\"schedulerow\" style=\"height:2px\"><td colspan=\"6\"></td></tr>\n<tr class=\"schedulerowtwo\" style=\"height:2px\"><td colspan=\"6\"></td></tr>\n<tr class=\"schedulerowtwo\" valign=\"top\">\n <td class=\"airline\"><img src=\"/webfids/images/q2.gif\" width=\"100\" height=\"24\" vspace=\"0\" alt=\"Maldivian\"/></td>\n <td class=\"flight\" nowrap>Q2 107</td>\n <td class=\"city\">Gan</td>\n <td class=\"time\">19:35</td>\n <td class=\"estimated\">19:30</td>\n <td class=\"status\"><div class=\"statusone\">LANDED</div></td>\n</tr>\n<tr>\n <td colspan=\"6\" class=\"sumheadtop\"> Saturday 19 March 2016</td>\n</tr>\n<tr>\n <td colspan=\"6\" class=\"sumheadbot\"> PASSENGER ARRIVALS | DOMESTIC & INTERNATIONAL | All Airlines | ALL OriginS</td>\n</tr>\n<tr class=\"schedulerow\" style=\"height:2px\"><td colspan=\"6\"></td></tr>\n<tr class=\"schedulerow\" valign=\"top\">\n <td class=\"airline\"><img src=\"/webfids/images/3u.gif\" width=\"100\" height=\"24\" vspace=\"0\" alt=\"Sichuan Airlines\"/></td>\n <td class=\"flight\" nowrap>3U 8989</td>\n <td class=\"city\">Chengdu</td>\n <td class=\"time\">19:00</td>\n <td class=\"estimated\">20:01</td>\n <td class=\"status\"><div class=\"statusone\">LANDED</div></td>\n</tr>\n<tr class=\"schedulerow\" style=\"height:2px\"><td colspan=\"6\"></td></tr>\n<tr class=\"schedulerowtwo\" style=\"height:2px\"><td colspan=\"6\"></td></tr>\n<tr class=\"schedulerowtwo\" valign=\"top\">\n <td class=\"airline\"><img src=\"/webfids/images/q2.gif\" width=\"100\" height=\"24\" vspace=\"0\" alt=\"Maldivian\"/></td>\n <td class=\"flight\" nowrap>Q2 107</td>\n <td class=\"city\">Gan</td>\n <td class=\"time\">19:35</td>\n <td class=\"estimated\">19:30</td>\n <td class=\"status\"><div class=\"statusone\">LANDED</div></td>\n</tr></table>";
Document doc = Jsoup.parse(html);
Element firstDateCell = doc.select("td.sumheadtop").first();
if (firstDateCell == null) {
    throw new RuntimeException("Unable to locate rows...");
}
System.out.println(firstDateCell.text());
for (Element aRow : firstDateCell.parent().siblingElements()) {
    if (!aRow.select("td.sumheadtop").isEmpty()) {
        // This sibling is another date header row
        System.out.println(aRow.text());
    } else {
        // Handle the row now...
        System.out.println(">> " + aRow.text());
    }
}
Output:
Friday 18 March 2016
>> PASSENGER ARRIVALS | DOMESTIC & INTERNATIONAL | All Airlines | ALL OriginS
>>
>> 3U 8989 Chengdu 19:00 20:01 LANDED
>>
>>
>> Q2 107 Gan 19:35 19:30 LANDED
Saturday 19 March 2016
>> PASSENGER ARRIVALS | DOMESTIC & INTERNATIONAL | All Airlines | ALL OriginS
>>
>> 3U 8989 Chengdu 19:00 20:01 LANDED
>>
>>
>> Q2 107 Gan 19:35 19:30 LANDED
However, the code above also prints the rows that come after Saturday 19 March 2016. You can add a break to prevent this.
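As a minimal sketch of that suggestion, the loop below walks forward from the first date header row with nextElementSibling() and breaks as soon as it reaches the next row containing a td.sumheadtop cell. The class name RowsBetweenHeaders, the helper rowsAfterFirstHeader, and the shortened sample markup are assumptions made for illustration; they are not part of the original code.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class RowsBetweenHeaders {

    // Hypothetical helper: collects the <tr> elements that follow the first
    // "sumheadtop" row, stopping just before the next "sumheadtop" row.
    public static List<Element> rowsAfterFirstHeader(Document doc) {
        List<Element> rows = new ArrayList<>();
        Element firstDateCell = doc.select("td.sumheadtop").first();
        if (firstDateCell == null) {
            return rows; // no date header found
        }
        // Start at the row right after the header's own <tr>
        Element row = firstDateCell.parent().nextElementSibling();
        while (row != null) {
            if (!row.select("td.sumheadtop").isEmpty()) {
                break; // reached the next date header, stop collecting
            }
            rows.add(row);
            row = row.nextElementSibling();
        }
        return rows;
    }

    public static void main(String[] args) {
        // Shortened sample markup (an assumption, not the full page)
        String html = "<table>"
                + "<tr><td class=\"sumheadtop\">Friday 18 March 2016</td></tr>"
                + "<tr class=\"schedulerow\"><td class=\"flight\">3U 8989</td><td class=\"city\">Chengdu</td></tr>"
                + "<tr class=\"schedulerowtwo\"><td class=\"flight\">Q2 107</td><td class=\"city\">Gan</td></tr>"
                + "<tr><td class=\"sumheadtop\">Saturday 19 March 2016</td></tr>"
                + "<tr class=\"schedulerow\"><td class=\"flight\">3U 8989</td><td class=\"city\">Chengdu</td></tr>"
                + "</table>";
        Document doc = Jsoup.parse(html);
        for (Element row : rowsAfterFirstHeader(doc)) {
            System.out.println(">> " + row.text());
        }
    }
}
```

With the full page, the collected rows would also include the "sumheadbot" row and the empty 2px spacer rows, so you may want to skip rows whose text is empty.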