Question

For the purposes of my question, I created a simple HTML page. Here is an excerpt:

<table class="fruit-vegetables">
  <thead>
    <th>Fruit</th>
    <th>Vegetables</th>
  </thead>
  <tbody>
    <tr>
      <td>
        <b>
          <a href="https://en.wikipedia.org/wiki/Apple" title="Apples">Apples</a>
        </b>
      </td>
      <td>
        <a href="https://en.wikipedia.org/wiki/Carrot" title="Carrots">Carrots</a>
      </td>
    </tr>
    <tr>
      <td>
        <i>
          <a href="https://en.wikipedia.org/wiki/Orange_%28fruit%29" title="Oranges">Oranges</a>
        </i>
      </td>
      <td>
        <a href="https://en.wikipedia.org/wiki/Pea" title="Peas">Peas</a>
      </td>
    </tr>
  </tbody>
</table>

I want to use Jsoup to extract the data from the first column, the one named "Fruit". So the result should be:

Apples
Oranges

I wrote a program. Here is an excerpt:

//In reality, it should be connect(html).get(). 
//Also, suppose that the String `html` has the full source code.
Document doc = Jsoup.parse(html); 

Elements table = doc.select("table.fruit-vegetables").select("tbody").select("tr").select("td").select("a");

for(Element element : table){
    System.out.println(element.text());
}

The output of the program is:

Apples
Carrots
Oranges
Peas

I know something is wrong, but I can't find my mistake. None of the other questions on Stack Overflow solved my problem. What should I do?

Answer 1

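Your chained select(...) calls match every <td> in every row, so the links from both columns are returned. It seems you are looking for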

Elements el = doc.select("table.fruit-vegetables td:eq(0)");
for (Element e : el){
    System.out.println(e.text());
}

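At http://jsoup.org/cookbook/extracting-data/selector-syntax you can find :eq(n) described as

:eq(n): find elements whose sibling index is equal to n; e.g. form input:eq(1)

So with td:eq(0) we select every <td> that is the first child of its parent, which in this case is the <tr>.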
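To make the "sibling index" rule concrete, here is a minimal sketch that applies the same rule by hand. It deliberately does not use jsoup: it is a stand-in built on the JDK's XML parser, which only works here because the table excerpt happens to be well-formed XML. The class name SiblingIndexDemo and the helpers siblingIndex and cellsAtIndex are hypothetical names made up for this illustration, and the markup is a condensed copy of the table from the question.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Illustration only: a hand-rolled version of the "sibling index" rule,
// using the JDK DOM parser as a stand-in for jsoup.
class SiblingIndexDemo {

    // Counts the element siblings that come before `el` under the same parent.
    static int siblingIndex(Element el) {
        int index = 0;
        for (Node n = el.getPreviousSibling(); n != null; n = n.getPreviousSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                index++;
            }
        }
        return index;
    }

    // Returns the text of every <td> whose sibling index equals n.
    static List<String> cellsAtIndex(String markup, int n) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(markup.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            // Assumption: the input is well-formed XML; real-world HTML often is not.
            throw new IllegalArgumentException("markup is not well-formed XML", e);
        }
        NodeList cells = doc.getElementsByTagName("td"); // document order
        List<String> result = new ArrayList<>();
        for (int i = 0; i < cells.getLength(); i++) {
            Element cell = (Element) cells.item(i);
            if (siblingIndex(cell) == n) {
                result.add(cell.getTextContent().trim());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String markup =
              "<table class=\"fruit-vegetables\"><tbody>"
            + "<tr><td><b><a title=\"Apples\">Apples</a></b></td>"
            + "<td><a title=\"Carrots\">Carrots</a></td></tr>"
            + "<tr><td><i><a title=\"Oranges\">Oranges</a></i></td>"
            + "<td><a title=\"Peas\">Peas</a></td></tr>"
            + "</tbody></table>";

        // Index 0 keeps only the first cell of each row.
        for (String text : cellsAtIndex(markup, 0)) {
            System.out.println(text);
        }
    }
}
```

Passing 1 instead of 0 would keep the second cell of each row. With jsoup itself none of this is needed, since the td:eq(0) selector performs the filtering for you.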
