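I'm trying to use jsoup to extract the content of the <td> tags that have the classes css-sched-table-title and css-sched-waypoints from the HTML below, but I can't get it to work. Can someone help me figure out what's wrong?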
Document doc = Jsoup.parse("somelink.html");
Elements row = doc.select(".css-sched-table-title td");
Iterator<Element> iterator = row.listIterator();
while(iterator.hasNext())
{
Element element = iterator.next();
String value = element.text();
System.out.println("value : " + value);
}
<tr>
<td ALIGN="CENTER" COLSPAN="16" CLASS="css-sched-table-title"><b>Saturday - </b><b>Afternoon</b></td>
</tr>
<tr VALIGN="BOTTOM">
<TD> </TD>
<TD ALIGN="CENTER" WIDTH="100" CLASS="css-sched-waypoints">Townline and Southern</TD>
<TD> </TD>
<TD ALIGN="CENTER" WIDTH="100" CLASS="css-sched-waypoints">Clearbrook and Blueridge</TD>
<TD> </TD>
<TD ALIGN="CENTER" WIDTH="100" CLASS="css-sched-waypoints">Clearbrook and South Fraser</TD>
<TD> </TD>
<TD ALIGN="CENTER" WIDTH="100" CLASS="css-sched-waypoints">Ar. Bourquin Exchange</TD>
<TD> </TD>
<TD ALIGN="CENTER" WIDTH="100" CLASS="css-sched-waypoints">Lv. Bourquin Exchange</TD>
<TD> </TD>
<TD ALIGN="CENTER" WIDTH="100" CLASS="css-sched-waypoints">Downtown Abbotsford</TD>
<TD> </TD>
<TD ALIGN="CENTER" WIDTH="100" CLASS="css-sched-waypoints">McMillan and Old Yale</TD>
<TD> </TD>
<TD ALIGN="CENTER" WIDTH="100" CLASS="css-sched-waypoints">Sandy Hill and Old Clayburn</TD>
</tr>
Answer 0 (score: 1)
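There is only one td tag with the class css-sched-table-title, but a whole list of td tags with the class css-sched-waypoints.
Also, use the correct selector syntax, which puts the class directly on the td (see here):
Elements row = doc.select("td.css-sched-waypoints");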
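To make the selector difference concrete, here is a minimal sketch. It assumes jsoup is on the classpath; the class name SelectorDemo, the helper countMatches, and the trimmed-down HTML are made up for illustration and are not part of the question.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class SelectorDemo {
    // Hypothetical sample: two rows trimmed from the question's markup,
    // wrapped in <table> so jsoup keeps the <tr>/<td> elements.
    static final String HTML =
        "<table>"
        + "<tr><td CLASS=\"css-sched-table-title\"><b>Saturday - </b><b>Afternoon</b></td></tr>"
        + "<tr><td> </td><td CLASS=\"css-sched-waypoints\">Townline and Southern</td></tr>"
        + "</table>";

    // Returns how many elements in the sample match the given CSS query.
    static int countMatches(String cssQuery) {
        Document doc = Jsoup.parse(HTML);
        Elements matches = doc.select(cssQuery);
        return matches.size();
    }

    public static void main(String[] args) {
        // Descendant selector: <td> elements nested INSIDE an element with the class.
        System.out.println(countMatches(".css-sched-table-title td"));
        // Compound selector: <td> elements that carry the class themselves.
        System.out.println(countMatches("td.css-sched-table-title"));
        System.out.println(countMatches("td.css-sched-waypoints"));
    }
}
```

The first query finds nothing because the title cell contains only b elements, not nested td elements, while the other two each match the single cell that carries the class.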
Note: the html file as posted is not valid, so jsoup will not interpret it as table content. I had to wrap the content above in <table></table> tags.
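If you want to see the effect of the missing table wrapper for yourself, a small sketch like the following should show it. The class name TableWrapDemo, the helper countWaypoints, and the shortened row are assumptions for the example, not code from the question.

```java
import org.jsoup.Jsoup;

public class TableWrapDemo {
    // Hypothetical sample: one row with two waypoint cells and no enclosing <table>.
    static final String ROWS =
        "<tr>"
        + "<td CLASS=\"css-sched-waypoints\">Townline and Southern</td>"
        + "<td CLASS=\"css-sched-waypoints\">Clearbrook and Blueridge</td>"
        + "</tr>";

    // Parses the given markup and counts the waypoint cells jsoup finds in it.
    static int countWaypoints(String html) {
        return Jsoup.parse(html).select("td.css-sched-waypoints").size();
    }

    public static void main(String[] args) {
        // Bare rows: the HTML parser drops <tr>/<td> tags that appear outside a table.
        System.out.println("bare rows: " + countWaypoints(ROWS));
        // Same rows wrapped in <table>: the cells survive and can be selected.
        System.out.println("wrapped:   " + countWaypoints("<table>" + ROWS + "</table>"));
    }
}
```

With the default HTML parser the text of the bare rows is still kept in the body, but the td elements themselves are gone, which is why the selector returns nothing until the rows are wrapped.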
When I try the following code with your html file:
Elements row = doc.select("td.css-sched-waypoints");
Element title = doc.select("td.css-sched-table-title").first();
System.out.println(title.text());
Iterator<Element> iterator = row.listIterator();
while (iterator.hasNext()) {
Element element = iterator.next();
String id = element.attr("id");
String classes = element.attr("class");
String value = element.text();
System.out.println("Id : " + id + ", classes : " + classes
+ ", value : " + value);
}
I get:
Saturday - Afternoon
Id : , classes : css-sched-waypoints, value : Townline and Southern
Id : , classes : css-sched-waypoints, value : Clearbrook and Blueridge
Id : , classes : css-sched-waypoints, value : Clearbrook and South Fraser
Id : , classes : css-sched-waypoints, value : Ar. Bourquin Exchange
Id : , classes : css-sched-waypoints, value : Lv. Bourquin Exchange
Id : , classes : css-sched-waypoints, value : Downtown Abbotsford
Id : , classes : css-sched-waypoints, value : McMillan and Old Yale
Id : , classes : css-sched-waypoints, value : Sandy Hill and Old Clayburn
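One more thing that may be worth checking in the question's code: Jsoup.parse(String) treats its argument as HTML markup, not as a file name, so passing "somelink.html" parses that literal text. To read a file, Jsoup.parse(File, String charsetName) can be used instead. The sketch below assumes the file is UTF-8; the class name LoadFromFileDemo, the helper countCellsBothWays, and the temporary file are made up for the example.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class LoadFromFileDemo {
    // Writes the markup to a temporary file, then parses it two ways.
    // Returns { td count when the PATH is parsed as markup, td count when the FILE is parsed }.
    static int[] countCellsBothWays(String html) {
        try {
            Path tmp = Files.createTempFile("schedule", ".html");
            try {
                Files.write(tmp, html.getBytes(StandardCharsets.UTF_8));
                // Jsoup.parse(String): the path string itself becomes the document text.
                Document fromPathString = Jsoup.parse(tmp.toString());
                // Jsoup.parse(File, charsetName): the file contents are read and parsed.
                Document fromFile = Jsoup.parse(tmp.toFile(), "UTF-8");
                return new int[] {
                    fromPathString.select("td").size(),
                    fromFile.select("td").size()
                };
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] counts = countCellsBothWays(
            "<table><tr><td>Townline and Southern</td></tr></table>");
        System.out.println("path parsed as markup: " + counts[0] + " cells");
        System.out.println("file parsed:           " + counts[1] + " cells");
    }
}
```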