Parsing authors with Jsoup

Date: 2015-10-28 17:53:59

Tags: java html parsing html-parsing

Here is my HTML input:

  <!-- left panel --> 
  <div class="left-panel"> 
    <p class="article-published"> 1. júl 2015 o 17:35 &nbsp;&nbsp; Marek Hudec, Dávid Tvrdoň </p>
  </div>

And the code:

if (!doc.select("p[class=article-published]").isEmpty()) {
    Elements description = doc.select("p[class=article-published]");
    for (Element link : description) {
        author4 = link.text();
    }
    System.out.println("AUTHORS :" + author4);
}

I want to get output like: Marek Hudec, Dávid Tvrdoň. So only the names of those people. But I cannot get it. Could someone please help me? Thanks.

1 answer:

Answer 0 (score: 0)

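All you have to do is parse the text you get from Jsoup and extract the data you want from it. In the code below I modified your code to fetch the data at a specific index.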

import java.util.Arrays;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class KolosParsor {
    public static void main(String[] args) {
        String author4 = null;
        // Parse the sample markup from the question.
        Document doc = Jsoup.parse("<div class=\"left-panel\">" +
            "<p class=\"article-published\"> 1. júl 2015 o 17:35 &nbsp;&nbsp; Marek Hudec,Dávid Tvrdoň </p></div>");
        Elements description = doc.select("p[class=article-published]");
        if (!description.isEmpty()) {
            // Keep the text of the last matching paragraph.
            for (Element link : description) {
                author4 = link.text();
            }
            // Print every space-separated token, then the one at index 7.
            System.out.println("DATA :" + Arrays.asList(author4.split(" ")));
            // Index 7 is tied to this exact input: it picks a single token,
            // so it shifts if the date format or the spacing changes.
            System.out.println("AUTHORS :" + Arrays.asList(author4.split(" ")).get(7));
        }
    }
}
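
Picking a fixed token index is fragile, so here is a minimal sketch of an alternative that does not depend on token positions. It assumes the paragraph text always has the shape "date o HH:mm, separators, authors", with authors separated by commas; that shape is inferred from the single sample in the question and may not hold for every article. The class name AuthorExtractor and the method extractAuthors are hypothetical names made up for this illustration. The helper works on a plain String, so with Jsoup you would pass it the result of link.text().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AuthorExtractor {

    // Assumption: the authors are everything that follows the first HH:mm
    // token. Whitespace and non-breaking spaces (U+00A0) after it are skipped.
    private static final Pattern AFTER_TIME =
            Pattern.compile("\\d{1,2}:\\d{2}[\\s\\u00A0]*(.*)$");

    /** Returns the author names found after the time, or an empty list. */
    public static List<String> extractAuthors(String paragraphText) {
        List<String> authors = new ArrayList<>();
        Matcher m = AFTER_TIME.matcher(paragraphText);
        if (!m.find()) {
            return authors; // no time token, so nothing to extract
        }
        // Split the remainder on commas and tidy up each name.
        for (String name : m.group(1).split(",")) {
            String cleaned = name.replace('\u00A0', ' ').trim();
            if (!cleaned.isEmpty()) {
                authors.add(cleaned);
            }
        }
        return authors;
    }

    public static void main(String[] args) {
        // Stand-in for link.text(); the non-breaking spaces are written as escapes.
        String text = " 1. júl 2015 o 17:35 \u00A0\u00A0 Marek Hudec, Dávid Tvrdoň ";
        System.out.println("AUTHORS: " + String.join(", ", extractAuthors(text)));
    }
}
```

Anchoring on the time rather than on a token count means the result should not change if the date gains or loses a word, or if the non-breaking spaces are normalized to ordinary spaces. It would still break if an article has no time in that paragraph, in which case the helper returns an empty list.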