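Selecting text with Jsoup in Android

Time: 2013-10-11 10:15:35

Tags: android jsoup

I am using the following code to select data from paragraph tags (it skips the data inside bold tags), and it works correctly.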

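import android.os.AsyncTask;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class MyTask extends AsyncTask<String, Void, String> {

    public String data;
    public String url;

    @Override
    protected String doInBackground(String... params) {

        data = "";
        url = params[0];

        try {
            // Fetch and parse the page, then walk every <p> element.
            Document doc = Jsoup.connect(url).get();
            Elements paragraphs = doc.select("p");
            for (Element element : paragraphs) {
                // ownText() returns only the text directly inside <p>,
                // so text inside child tags such as <b> is skipped.
                data += element.ownText();
                data += "\n\n";
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        return data;
    }
}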

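I have to select data from paragraph tags that also contain bold tags. How can I select the whole data without skipping the bold tag data (both the paragraph data and the bold tag data)?

2 answers:

Answer 0: (score: 0)

The source is invalid HTML.

To get what you want, you have to select the <body> tag and retrieve its text.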

    String data = "";
    Document doc = Jsoup.connect("http://www.nahjulbalagha.org/SermonDetail.php?Sermon=1").get();
    Element e = doc.select("body").first();
    data += e.text();
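
As a side note on why the code in the question drops the bold text: Element.ownText() returns only the text owned directly by an element, while Element.text() also includes the text of its children. Below is a minimal sketch of that difference, assuming a small inline HTML string that is made up for illustration (it is not the real page) and a hypothetical class name TextVsOwnText.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;

// Hypothetical demo class; the HTML below is a made-up stand-in for the real page.
class TextVsOwnText {

    static final String HTML =
            "<html><body><b>Title</b><p>Hello <b>there</b> now!</p></body></html>";

    // Text owned directly by the first <p>, without text from child tags such as <b>.
    static String paragraphOwnText(String html) {
        Element p = Jsoup.parse(html).select("p").first();
        return p.ownText();
    }

    // Combined text of the first <p> and all of its children.
    static String paragraphText(String html) {
        Element p = Jsoup.parse(html).select("p").first();
        return p.text();
    }

    // Combined text of the whole <body>, as suggested in this answer.
    static String bodyText(String html) {
        return Jsoup.parse(html).body().text();
    }

    public static void main(String[] args) {
        System.out.println(paragraphOwnText(HTML));
        System.out.println(paragraphText(HTML));
        System.out.println(bodyText(HTML));
    }
}
```

In the loop from the question, replacing element.ownText() with element.text() would likewise keep the bold text that sits inside each paragraph.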

Answer 1: (score: 0)

With Jsoup, keep in mind that the result is a DOM document. Your <p> tag sits lower in the hierarchy; it is part of the table data "

<td> <!-- InstanceBeginEditable name="EditRegion1" --> 

                <b>Creation of Earth and Sky and the birth of Adam. </b><hr><br><p>Praise is due to Allah whose worth cannot be described by speakers, whose bounties cannot be counted by calculators and whose claim (to obedience) cannot be satisfied by those who attempt to do so, whom the height of intellectual courage cannot appreciate, and the divings of understanding cannot reach; He for whose description no limit has been laid down, no eulogy exists, no time is ordained and no duration is fixed.<p>

He brought forth creation through His Omnipotence, dispersed winds through His Compassion, and made firm the shaking earth with rocks. <p>"

so if you also want the b tag values along with the p tag values, select the td tag:

Elements e = doc.select("td"); 
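
To illustrate, here is a hedged sketch of selecting the td and reading its combined text. The HTML is a shortened fragment modelled on the snippet above and wrapped in a table so that the parser keeps the td; the class name TdTextDemo and the method cellText are hypothetical.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Hypothetical demo class; the HTML is a shortened stand-in for the real page.
class TdTextDemo {

    static final String HTML =
            "<table><tr><td>"
            + "<b>Creation of Earth and Sky and the birth of Adam. </b><hr><br>"
            + "<p>Praise is due to Allah whose worth cannot be described by speakers."
            + "<p>He brought forth creation through His Omnipotence."
            + "</td></tr></table>";

    // Collects the text of every <td>, including text inside nested <b> and <p> tags.
    static String cellText(String html) {
        Document doc = Jsoup.parse(html);
        Elements cells = doc.select("td");
        StringBuilder data = new StringBuilder();
        for (Element cell : cells) {
            data.append(cell.text()).append("\n\n");
        }
        return data.toString();
    }

    public static void main(String[] args) {
        System.out.println(cellText(HTML));
    }
}
```

On a real page, doc.select("td") may match many cells, so a narrower selector might be needed; which one fits depends on the page's markup.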

Thanks