Question

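What I'm trying to do is grab a link, in this case to a webm file, and store it in a String. The page I'm scraping is http://www.hearthpwn.com/cards/503-ragnaros-the-firelord, and when I view the page source the link I want is on line 1010. I'd like the approach to work across different pages, so I don't want to scan line by line. Could someone give me a small example, just to get started, of how to scrape only the link associated with "data-animationurl="? That would be great, thanks.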

Answer 1

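More information about jsoup is available in its official documentation.

You'll want to wrap this in an AsyncTask so your app doesn't hang, but this should give you a good start: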

import java.io.IOException;
import java.util.Map;

import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

try {
    // Connect to the URL and set a user agent so the request is not blocked
    Connection connect = Jsoup.connect("http://www.hearthpwn.com/cards/503-ragnaros-the-firelord");
    connect.userAgent("Mozilla/5.0");

    // Fetch the HTML and select the first <video class="hscard-video" ...>
    Document doc = connect.get();
    Element video = doc.select("video.hscard-video").first();

    // first() returns null when nothing matches, so guard against a missing element
    if (video != null) {
        // Read all data-* attributes as a map (e.g. data-href, data-usegold...)
        Map<String, String> dataSet = video.dataset();

        // If data-animationurl exists, print it (or store it in a String instead)
        if (dataSet.containsKey("animationurl")) {
            System.out.println(dataSet.get("animationurl"));
        }
    }
} catch (IOException e) {
    e.printStackTrace();
}
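
The answer recommends AsyncTask so the network call stays off the UI thread. AsyncTask is Android-specific and was deprecated in API level 30, so here is a rough sketch of the same idea using plain java.util.concurrent instead. It swaps in a different technique from the one the answer names. The class name BackgroundScrape, the method runInBackground, and the placeholder URL are hypothetical; in a real app the Callable would contain the jsoup code and return the scraped link.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BackgroundScrape {

    // Submits a blocking task to a worker thread and returns a handle to its result.
    static Future<String> runInBackground(ExecutorService executor, Callable<String> task) {
        return executor.submit(task);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            // Assumption: this stands in for the jsoup scraping code, which
            // would return the data-animationurl value instead of a constant.
            Callable<String> scrape = () -> "http://example.com/placeholder.webm";

            Future<String> pending = runInBackground(executor, scrape);

            // get() blocks until the worker finishes; on Android, avoid calling
            // it on the UI thread and hand the result back some other way.
            String animationUrl = pending.get();
            System.out.println(animationUrl);
        } finally {
            // Release the worker thread once no more tasks will be submitted.
            executor.shutdown();
        }
    }
}
```

A single-thread executor is enough here because only one page is fetched at a time; any exception thrown inside the Callable (such as an IOException from jsoup) surfaces from get() wrapped in an ExecutionException.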

Scraping a specific element with jsoup.
