Question

I want to extract the text from a specific row of a given table, for example:

<table>
   <th> head1 </th>
   <th> head2 </th>
   <tr> <td> cell1 </td> <td> cell2 </td> </tr>
   <tr> <td> cell3 </td> <td> cell4 </td> </tr>
</table>

Using Jsoup in Java, how can I extract only the contents of row 1 of this table? The desired output is:

cell1, cell2

I tried the following code, but it also prints the header row, which I don't want:

    Element table = doc.getElementsByTag("table").first();
    Elements trs = table.getElementsByTag("tr");
    for (Element tr : trs) {
        for (Element td : tr.getAllElements()) {
            System.out.println("TD: " + td.text());
             ....

Answer 1

Try it this way:

Elements tdsInSecondRow = doc.select("table tr:eq(1) > td");
for (Element td : tdsInSecondRow)
{
    System.out.println("TD: " + td.text());
}

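To understand the selector, let me split it into 3 parts:

table - selects the table
tr:eq(1) - from it, selects the second tr (the index is 0-based)
> td - and from that tr, selects the tds that are its direct children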
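As a minimal sketch of how the selector could produce the comma-separated output asked for in the question: the class name `FirstDataRow` and the helper `rowText` are made up for illustration, and the sketch assumes jsoup is on the classpath. It relies on jsoup's HTML parser wrapping the bare th cells in a tr of their own, which is why the first data row ends up at index 1.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

class FirstDataRow {
    // Hypothetical helper: joins the text of the td cells in the tr
    // whose 0-based sibling index is rowIndex.
    static String rowText(String html, int rowIndex) {
        Document doc = Jsoup.parse(html);
        Elements tds = doc.select("table tr:eq(" + rowIndex + ") > td");
        // eachText() returns the trimmed text of every matched element.
        return String.join(", ", tds.eachText());
    }

    public static void main(String[] args) {
        // Same markup as in the question.
        String html = "<table>"
                + "<th> head1 </th>"
                + "<th> head2 </th>"
                + "<tr> <td> cell1 </td> <td> cell2 </td> </tr>"
                + "<tr> <td> cell3 </td> <td> cell4 </td> </tr>"
                + "</table>";
        System.out.println(rowText(html, 1));
    }
}
```

Because the child combinator only matches td elements, the header cells would not be picked up even if `rowIndex` pointed at the header row; the result would simply be an empty string.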

To make it work with your loop, set a boolean flag or a counter to detect the first iteration of the loop and skip it with continue, like this:

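boolean isFirstIteration = true;
for (Element tr : trs) {
    // Skip the first tr, which holds the header cells.
    if (isFirstIteration) {
        isFirstIteration = false;
        continue;
    }
    // select("td") returns only the cells; getAllElements() would also return the tr itself.
    for (Element td : tr.select("td")) { ... }
}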

If you use a counter instead, you can also take every 2nd or 3rd row.
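The counter variant could look roughly like the sketch below. The class `RowCounterSketch` and the method `everyNth` are hypothetical names, and plain strings stand in for the rows so the example runs without jsoup. Since jsoup's `Elements` is a `List<Element>`, the same method should accept the `trs` collection directly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RowCounterSketch {
    // Skips the first `skip` rows (e.g. a header), then keeps every n-th row.
    static <T> List<T> everyNth(List<T> rows, int skip, int n) {
        List<T> picked = new ArrayList<>();
        int counter = 0;
        for (T row : rows) {
            if (counter >= skip && (counter - skip) % n == 0) {
                picked.add(row);
            }
            counter++;
        }
        return picked;
    }

    public static void main(String[] args) {
        // Stand-ins for the text of each tr; the last row is invented for the demo.
        List<String> rows = Arrays.asList(
                "head1 head2", "cell1 cell2", "cell3 cell4", "cell5 cell6");
        // Skip the header row, then take every 2nd row.
        System.out.println(everyNth(rows, 1, 2));
    }
}
```

With `skip = 1` and `n = 1` this reduces to the boolean-flag version: everything except the header row is kept.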
