How to retrieve file names from an ArrayList and write content to each file using Jsoup

Asked: 2014-12-08 07:12:18

Tags: java jsoup

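For example, I store two values in each ArrayList: link[0], link[1] and commentslink[0], commentslink[1], with the file names in FileWrite[0], FileWrite[1].

I want to retrieve link[0] and commentslink[0] and write their content to FileWrite[0].

I store the file names in an ArrayList, and I want to retrieve each file name and write to that file the content fetched with Jsoup.

How can I use the file names in the ArrayList to write the values from the matching link and commentslink entries?

I want output like this: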

File name to write: head.txt

Head, Shoulders, Knees and Toes | Popular Nursery Rhymes Collection for Kids | ChuChu TV Rhymes Zone

All comments

File name to write: twin.txt

Twinkle Twinkle Little Star and Many More Videos | Popular Nursery Rhymes Collection by ChuChu TV

All comments

My code:

ArrayList<String>link=new ArrayList<String>();
        link.add("https://www.youtube.com/watch?v=hNcSKJQfrfM");
        link.add("https://www.youtube.com/watch?v=Rj2QkLaaj2E");
        ArrayList<String>commentslink=new ArrayList<String>();
        commentslink.add("https://apis.google.com/u/0/wm/4/_/widget/render/comments?usegapi=1&first_party_property=YOUTUBE&href=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DhNcSKJQfrfM&owner_id=BnZ16ahKA2DZ_T5W0FPUXg&query=http%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DhNcSKJQfrfM&stream_id=UCBnZ16ahKA2DZ_T5W0FPUXg&substream_id=hNcSKJQfrfM&view_type=FILTERED&width=826&youtube_video_acl=PUBLIC&viewer_id=UCI7Gw-Kd1PkeBDQ7C33kN9A&hl=en_US&origin=https%3A%2F%2Fwww.youtube.com&search=%3Fv%3DhNcSKJQfrfM&hash=&gsrc=1p&jsh=m%3B%2F_%2Fscs%2Fabc-static%2F_%2Fjs%2Fk%3Dgapi.gapi.en.T4EayPRcrOA.O%2Fm%3D__features__%2Frt%3Dj%2Fd%3D1%2Frs%3DAItRSTOp4ORGjvVzLjTlu0PIOZx2FtcWuA#_methods=onPlusOne%2C_ready%2C_close%2C_open%2C_resizeMe%2C_renderstart%2Concircled%2Cdrefresh%2Cerefresh%2Confirsttimeplusonepromo%2Conthumbsup%2Contimestampclicked%2Conshareboxopen%2Conready%2Conallcommentsclicked&id=I0_1418016584228&parent=https%3A%2F%2Fwww.youtube.com&pfname=&rpctoken=10277320");
        commentslink.add("https://apis.google.com/u/0/wm/4/_/widget/render/comments?usegapi=1&first_party_property=YOUTUBE&href=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DRj2QkLaaj2E&owner_id=BnZ16ahKA2DZ_T5W0FPUXg&query=http%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DRj2QkLaaj2E&stream_id=UCBnZ16ahKA2DZ_T5W0FPUXg&substream_id=Rj2QkLaaj2E&view_type=FILTERED&width=826&youtube_video_acl=PUBLIC&viewer_id=UCI7Gw-Kd1PkeBDQ7C33kN9A&hl=en_US&origin=https%3A%2F%2Fwww.youtube.com&search=%3Fv%3DRj2QkLaaj2E&hash=&gsrc=1p&jsh=m%3B%2F_%2Fscs%2Fabc-static%2F_%2Fjs%2Fk%3Dgapi.gapi.en.T4EayPRcrOA.O%2Fm%3D__features__%2Frt%3Dj%2Fd%3D1%2Frs%3DAItRSTOp4ORGjvVzLjTlu0PIOZx2FtcWuA#_methods=onPlusOne%2C_ready%2C_close%2C_open%2C_resizeMe%2C_renderstart%2Concircled%2Cdrefresh%2Cerefresh%2Confirsttimeplusonepromo%2Conthumbsup%2Contimestampclicked%2Conshareboxopen%2Conready%2Conallcommentsclicked&id=I0_1418018057546&parent=https%3A%2F%2Fwww.youtube.com&pfname=&rpctoken=28730451");
        ArrayList<String>FileWrite=new ArrayList<String>();
        FileWrite.add("head.txt");
        FileWrite.add("twin.txt");
        BufferedWriter bw1 = new BufferedWriter(new FileWriter(
                "D:\\chu"));

        // retrieve from each link
        int count=0;
        for(int k=0;k<FileWrite.size();k++){
            for(int i=0;i<link.size();i++){
                for(int j=0;j<commentslink.size();j++){

                    Document doc,doc1,doc2;
                    doc2=Jsoup.parse(FileWrite.get(k));
                    doc=Jsoup.connect(link.get(i)).get();
                    doc1=Jsoup.connect(commentslink.get(j)).get();

                    System.out.println("Index:"+count);
                    Elements title = doc.select("span[class=watch-title long-title]");
                    System.out.println(title.text());
                    bw1.write(title.text());
                    bw1.newLine();
                    Elements comments = doc1.select("div[class=yJa]").select("strong");
                    System.out.println(comments.text());
                    bw1.write(comments.text());
                }
            }
        }
        bw1.close();
    }
    catch (Exception eo) {
        eo.printStackTrace();
    }

It shows this error:

java.lang.IllegalArgumentException: Malformed URL: head
at org.jsoup.helper.HttpConnection.url(HttpConnection.java:64)
at org.jsoup.helper.HttpConnection.connect(HttpConnection.java:30)
at org.jsoup.Jsoup.connect(Jsoup.java:73)
at SampleArrayListChu.main(SampleArrayListChu.java:47)
Caused by: java.net.MalformedURLException: no protocol: head
at java.net.URL.<init>(Unknown Source)
at java.net.URL.<init>(Unknown Source)
at java.net.URL.<init>(Unknown Source)
at org.jsoup.helper.HttpConnection.url(HttpConnection.java:62)
... 3 more

1 Answer:

Answer 0 (score: 0)

I think the problem is in this line:

doc2=Jsoup.connect(FileWrite.get(k)).get();

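I'm not very familiar with Jsoup, but as far as I know, connect() expects a URL as its argument, and you are passing a file name instead: FileWrite.get(0) returns the string "head", which has no protocol such as http:// or https://.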
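The stack trace shows that the failure comes from java.net.URL rather than from Jsoup itself, so it can be reproduced without any network access. The following is a minimal sketch; the class name UrlCheck and the helper isAbsoluteUrl are made up for illustration and are not part of Jsoup.

```java
import java.net.MalformedURLException;
import java.net.URL;

public class UrlCheck {

    // Hypothetical helper: returns true only if java.net.URL accepts the string.
    // A bare file name such as "head" has no protocol, so URL rejects it.
    static boolean isAbsoluteUrl(String s) {
        try {
            new URL(s);
            return true;
        } catch (MalformedURLException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isAbsoluteUrl("head"));                                        // false
        System.out.println(isAbsoluteUrl("https://www.youtube.com/watch?v=hNcSKJQfrfM")); // true
    }
}
```

A check like this could guard any string before it is handed to Jsoup.connect().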

As the stack trace says: java.lang.IllegalArgumentException: Malformed URL: head

This means the argument is not a valid URL.
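One possible way to get the output described in the question is to treat the file names purely as output paths and to walk the three lists with a single index, so that entry i of each list goes together. Note that the three nested loops in the question would visit every combination of file, link and comments link rather than matching pairs. This is only a sketch under assumptions: the class PairedFileWriter, the method writeAll and the two fetch functions are hypothetical names, and the fetch functions are stubbed so the example runs without Jsoup or a network connection.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class PairedFileWriter {

    // Writes the title fetched for links.get(i) and the comments fetched for
    // commentLinks.get(i) into the file named fileNames.get(i) inside dir.
    // fetchTitle and fetchComments are assumptions standing in for the Jsoup calls.
    static void writeAll(Path dir,
                         List<String> links,
                         List<String> commentLinks,
                         List<String> fileNames,
                         Function<String, String> fetchTitle,
                         Function<String, String> fetchComments) throws IOException {
        if (links.size() != fileNames.size() || commentLinks.size() != fileNames.size()) {
            throw new IllegalArgumentException("The three lists must have the same size");
        }
        // One loop, one index: the file name is only ever used to open a file.
        for (int i = 0; i < fileNames.size(); i++) {
            Path out = dir.resolve(fileNames.get(i));
            try (BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
                writer.write(fetchTitle.apply(links.get(i)));
                writer.newLine();
                writer.write(fetchComments.apply(commentLinks.get(i)));
                writer.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("chu");
        // Placeholder URLs, not the ones from the question.
        List<String> links = Arrays.asList("https://example.com/video/1", "https://example.com/video/2");
        List<String> commentLinks = Arrays.asList("https://example.com/comments/1", "https://example.com/comments/2");
        List<String> fileNames = Arrays.asList("head.txt", "twin.txt");

        // Stub fetchers; a real version would download and parse each page.
        writeAll(dir, links, commentLinks, fileNames,
                url -> "title of " + url,
                url -> "comments of " + url);

        for (String name : fileNames) {
            System.out.println(name + ": " + Files.readAllLines(dir.resolve(name), StandardCharsets.UTF_8));
        }
    }
}
```

In the real program the stubs would presumably be replaced by code that calls Jsoup.connect(url).get() and applies the selectors from the question. Since get() throws a checked IOException, that call would need to be wrapped before it can be used as a Function.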