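How can I use JSoup to get a specific value?

Time: 2014-09-02 19:40:25

Tags: java spring-mvc jsoup

I am trying to parse HTML text to get the specific values that follow a keyword. For example, given the following markup: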

<table>

    <tr>
        <td class="odd">TW-Central</td>
        <td class="odd">$3.8600</td>
        <td class="odd">$3.8600</td>
        <td class="odd">$3.8600</td>
        <td class="odd red">-0.0168</td>
        <td class="odd right">42,500</td>
        <td class="odd right">7</td>
    </tr>



    <tr>
        <td class="even">Waha</td>
        <td class="even">$3.9600</td>
        <td class="even">$3.8800</td>
        <td class="even">$3.9196</td>
        <td class="even red">-0.0436</td>
        <td class="even right">69,500</td>
        <td class="even right">17</td>
    </tr>



    <tr>
        <td class="odd">White River Hub</td>
        <td class="odd">$3.8200</td>
        <td class="odd">$3.7975</td>
        <td class="odd">$3.8088</td>
        <td class="odd red">-0.0184</td>
        <td class="odd right">81,200</td>
        <td class="odd right">13</td>
    </tr>

</table>

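Once I have found the keyword Waha, how can I get the prices listed under it and return them? Any help would be greatly appreciated. I am writing this in Java using STS; if JSoup is not the best tool for the job, suggestions for alternatives would also be very welcome. Thanks!

1 Answer:

Answer 0: (score: 0)

If the table's layout does not change, simply select all the td elements and use the get(index) method to pick the one you want.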

    StringBuilder html = new StringBuilder();
    html.append("  <table>");
    html.append("    <tr>");
    html.append("     <td class=\"even\">Waha</td>");
    html.append("     <td class=\"even\">$3.9600</td>");
    html.append("     <td class=\"even\">$3.8800</td>");
    html.append("    </tr>");
    html.append("  </table>");

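    // Requires org.jsoup.Jsoup, org.jsoup.nodes.Document and org.jsoup.select.Elements.
    Document document = Jsoup.parse(html.toString());
    // select("td") returns every cell in document order, so the fixed indexes
    // below only hold while the table layout stays the same.
    Elements tdElements = document.select("td");
    String waha = tdElements.get(0).text();
    String firstPrice = tdElements.get(1).text();
    String secondPrice = tdElements.get(2).text();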

    System.out.println("The first td content is: " + waha);
    System.out.println("The second td content (firstPrice) is: " + firstPrice);
    System.out.println("The third td content (secondPrice) is: " + secondPrice);

Update

To select the values dynamically, use the following code:

@Test
public void testJSOUP() {
    StringBuilder html = new StringBuilder();
    html.append("  <table>");
    html.append("    <tr>");
    html.append("     <td class=\"odd\">TW-Central</td>");
    html.append("     <td class=\"odd\">$3.9600</td>");
    html.append("     <td class=\"odd\">$3.8800</td>");
    html.append("    </tr>");
    html.append("    <tr>");
    html.append("     <td class=\"even\">Waha Row</td>");
    html.append("     <td class=\"even\">$4.9600</td>");
    html.append("     <td class=\"even\">$5.8800</td>");
    html.append("    </tr>");
    html.append("    <tr>");
    html.append("     <td class=\"odd\">White River Hub</td>");
    html.append("     <td class=\"odd\">$4.9600</td>");
    html.append("     <td class=\"odd\">$5.8800</td>");
    html.append("    </tr>");
    html.append("  </table>");

    Document document = Jsoup.parse(html.toString());
    Elements trElements = document.select("tr");
    for (Element tableRows : trElements) {
        Elements tdElements = tableRows.select("td");
        String articleName = tdElements.get(0).text();
        String firstPrice = tdElements.get(1).text();
        String secondPrice = tdElements.get(2).text();

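        System.out.println("The article: " + articleName + " has price one: " + firstPrice + " and price two: " + secondPrice);
    }
}

This produces the following output:

The article: TW-Central has price one: $3.9600 and price two: $3.8800
The article: Waha Row has price one: $4.9600 and price two: $5.8800
The article: White River Hub has price one: $4.9600 and price two: $5.8800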
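The loop above prints every row, whereas the question asks only for the prices in the Waha row. A minimal sketch of that lookup follows. It assumes the name is always in the first cell of a row and that a price cell is any later cell whose text starts with "$" (the question does not show column headers, so that rule is a guess). The class name HubPriceLookup and the method pricesFor are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

class HubPriceLookup {

    /**
     * Returns the price cells of the first row whose first cell equals hubName.
     * Assumption: a "price" is any later cell whose text starts with "$".
     * Returns an empty list when no row matches.
     */
    static List<String> pricesFor(String html, String hubName) {
        Document document = Jsoup.parse(html);
        for (Element row : document.select("tr")) {
            Elements cells = row.select("td");
            // Skip rows without cells and rows whose name cell does not match exactly.
            if (cells.isEmpty() || !cells.get(0).text().equals(hubName)) {
                continue;
            }
            List<String> prices = new ArrayList<>();
            for (int i = 1; i < cells.size(); i++) {
                String text = cells.get(i).text();
                if (text.startsWith("$")) {
                    prices.add(text);
                }
            }
            return prices;
        }
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        // Shortened version of the table from the question.
        String html = "<table>"
            + "<tr><td class=\"odd\">TW-Central</td><td class=\"odd\">$3.8600</td>"
            + "<td class=\"odd\">$3.8600</td><td class=\"odd\">$3.8600</td>"
            + "<td class=\"odd red\">-0.0168</td></tr>"
            + "<tr><td class=\"even\">Waha</td><td class=\"even\">$3.9600</td>"
            + "<td class=\"even\">$3.8800</td><td class=\"even\">$3.9196</td>"
            + "<td class=\"even red\">-0.0436</td></tr>"
            + "</table>";
        System.out.println(pricesFor(html, "Waha"));
    }
}
```

The comparison uses equals, so "Waha" will not match a cell such as "Waha Row". A selector such as tr:has(td:containsOwn(Waha)) is shorter, but containsOwn does a case-insensitive substring match, so it would pick up both.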