Jsoup parsing (regular expressions)

Time: 2017-02-21 09:22:05

Tags: java jsoup

How can I use jsoup to extract these values:

  

resources/audio/songs/73742facb924e6.mp3

Future Is Now

Green Grey

from this HTML:

<div class="playlist-item" 
    id="playlist-item-2" 
    data-song-id="8365" 

    onmouseover="javascript: pageUtils.playlistItemSharebar(2);"
    onclick="jwPlayerUtils.playSong(8365, 2, event); _gaq.push(['_setAccount', 'UA-1091709-7']); _gaq.push(['_trackPageview']);">

    <input type="hidden" id="song-path-8365" value="resources/audio/songs/73742facb924e6.mp3" />
    <input type="hidden" id="song-mode-8365" value="song" />
    <input type="hidden" id="song-name-left-8365" value="Future Is Now" />
    <input type="hidden" id="song-name-right-8365" value="Green Grey" />
    <input type="hidden" id="song-programName-8365" value="" />
        <input type="hidden" id="song-img-8365" value="resources/img/songs/98x74_DIR/8365.jpg?201702161231" />

This is what I have tried so far:

Document document = Jsoup.connect(url).execute().parse();

Elements elements = document.select(".playlist-item");
for (Element element : elements) {
    System.out.println("Artist: " + element.select("[id~=^song-name-right-[0-9]+$]").select("[value]"));
    System.out.println("Song: " + element.select("[id~=^song-name-left-[0-9]+$]").select("[value]"));
    System.out.println("Link: " + element.select("[id~=^song-path-[0-9]+$]").select("[value]"));
}

But the result is the full markup of each element instead of just its value:

<input type="hidden" id="song-path-8365" value="resources/audio/songs/73742facb924e6.mp3" />
<input type="hidden" id="song-name-left-8365" value="Future Is Now" />
<input type="hidden" id="song-name-right-8365" value="Green Grey" />

I'm sure there are other (simpler, more correct) ways to handle this, so I would appreciate any help.

1 answer:

Answer 0 (score: 1)

Elements inputs = document.select("input");
for(Element input:inputs){
    System.out.println(input.attr("value"));
}
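
Note that the loop above prints the value of every input on the page, including the mode, program name and image fields. If only the three values from the question are wanted, one possible approach is to keep the per-item loop from the question and call attr("value") on the selected input, rather than select("[value]"), which returns elements and therefore prints their markup. The following is a minimal sketch, not a definitive implementation: the class name SongExtractor and the helper valueOf are made up for illustration, the HTML is inlined (with the div closed) so the example runs without a network call, and the prefix selector [id^=...] is swapped in for the question's regex selector, which should work equally well here.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class SongExtractor {

    // Hypothetical helper: returns the "value" attribute of the first <input>
    // inside `item` whose id starts with `idPrefix`, or "" if nothing matches.
    static String valueOf(Element item, String idPrefix) {
        return item.select("input[id^=" + idPrefix + "]").attr("value");
    }

    public static void main(String[] args) {
        // Inlined copy of the markup from the question (assumption: the real
        // page closes the div after the hidden inputs).
        String html =
              "<div class=\"playlist-item\" id=\"playlist-item-2\" data-song-id=\"8365\">"
            + "<input type=\"hidden\" id=\"song-path-8365\" value=\"resources/audio/songs/73742facb924e6.mp3\" />"
            + "<input type=\"hidden\" id=\"song-mode-8365\" value=\"song\" />"
            + "<input type=\"hidden\" id=\"song-name-left-8365\" value=\"Future Is Now\" />"
            + "<input type=\"hidden\" id=\"song-name-right-8365\" value=\"Green Grey\" />"
            + "</div>";

        Document document = Jsoup.parse(html);

        // One iteration per playlist entry; each lookup is scoped to that entry.
        for (Element item : document.select(".playlist-item")) {
            System.out.println("Artist: " + valueOf(item, "song-name-right-"));
            System.out.println("Song: " + valueOf(item, "song-name-left-"));
            System.out.println("Link: " + valueOf(item, "song-path-"));
        }
    }
}
```

Against a live page, Jsoup.parse(html) would be replaced by the Jsoup.connect(url) call from the question. Because each lookup is scoped to one .playlist-item, the artist, song and link stay grouped per entry.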