Accessing elements by class with jsoup

Time: 2014-04-05 04:23:04

Tags: html dom jsoup

This is the HTML file:

<!DOCTYPE html>
    <html lang="en">
    <head>
    <meta charset="UTF-8" />
    <title>Title</title>
    </head>
    <body>

    <h1>Demo</h1>
     <div class="eta">
        <h2>Text</h2>
        <h2 class="strike">Text1</h2>
        <div class="del">
          <p>Text2</p>
        </div>
        <p class="desc">Text3</p>
      </div>
    </body>
    </html>

I want to access the first element inside class="eta", the one containing Text. I wrote the following code:

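import java.io.File;
import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class Main {
    public static void main(String[] args) {
        try {
            // Parse the local HTML file as UTF-8
            File input = new File("/path/sample.html");
            Document doc1 = Jsoup.parse(input, "UTF-8");

            // Select every element with class "eta" and print the text of the first match
            Elements details2 = doc1.getElementsByClass("eta");
            String status2 = details2.first().text();
            System.out.println(status2);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}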

The program prints the following: Text Text1 Text2 Text3

However, I only want to extract Text. How can I do that?

1 answer:

Answer 0: (score: 1)

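// A class selector needs a leading dot; "eta" on its own would match <eta> tags
Elements divs = doc1.select("div.eta");
Element firstDiv = divs.first();
// text() on the div includes the text of all its descendants, so read only its first child element
System.out.println(firstDiv.child(0).text());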
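
For anyone who wants to try this without a file on disk, here is a minimal self-contained sketch. It assumes jsoup is on the classpath. The class name FirstChildText, the HTML constant and the helper firstChildText are made up for illustration; the markup is the sample from the question, inlined as a string. It uses the child combinator ".eta > *" so that only direct children of the .eta element are matched, then takes the first one.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class FirstChildText {
    // Sample markup from the question, inlined so no file is needed (illustrative constant).
    static final String HTML =
        "<!DOCTYPE html><html lang=\"en\"><head><meta charset=\"UTF-8\" />"
        + "<title>Title</title></head><body><h1>Demo</h1>"
        + "<div class=\"eta\"><h2>Text</h2><h2 class=\"strike\">Text1</h2>"
        + "<div class=\"del\"><p>Text2</p></div>"
        + "<p class=\"desc\">Text3</p></div></body></html>";

    // Hypothetical helper: text of the first direct child of an element
    // with class "eta", or null when nothing matches.
    static String firstChildText(Document doc) {
        Element first = doc.select(".eta > *").first();
        return first == null ? null : first.text();
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(HTML);
        System.out.println(firstChildText(doc)); // prints "Text"
    }
}
```

If you know the element is always an h2, a narrower selector such as "div.eta > h2" is probably the safer choice, since it will not silently pick up some other element if the markup changes.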