How do I get an element's href using a selector?

Time: 2011-08-15 23:19:47

Tags: android jsoup

I use this function to fetch the items from this website and return them as a list.

    Document doc = null;
    try {
        doc = Jsoup.connect("http://www.gamespy.com/index/release.html").get();
    } catch (IOException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
    // Get all td's that are a child of a row - each game has 4 of these
    Elements games = doc.select("tr > td.indexList1, tr > td.indexList2");

    // Iterate over those elements
    ListIterator<Element> postIt = games.listIterator();
    while (postIt.hasNext()) {
        // Add the game text to the ArrayList
        String name = postIt.next().text();
        String platform = postIt.next().text();
        String genre = postIt.next().text();
        String releaseDate = postIt.next().text();
        gameList.add(new GameRelease(name, platform, genre, releaseDate));
        Log.v(TAG, name + platform + genre + releaseDate);
    }

This is the HTML for each item:

<tr>
<td class="indexList1" align="left">
  <a href="http://pc.gamespy.com/pc/hacker-evolution-duality-/" class="b1">  
    <em>Hacker Evolution Duality </em>
  </a>
</td>
<td class="indexList1" align="center">
  PC 
</td>    
<td class="indexList1" align="center">

  Adventure 
</td>
<td class="indexList1" align="center">
    August 15, 2011
    <!--08/15/2011-->
</td>
</tr>

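Every item follows the same pattern, but I would like to know whether I can also retrieve each item's URL. You may want to look at the page's HTML source as well; that is probably the better way to see the structure.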

What I want is to store each item's URL in a string.

2 Answers:

Answer 0 (score: 2)

while (postIt.hasNext()) {
    // Get the title of the game
    Element title = postIt.next();

    System.out.println(title.text());

    // Get the anchor element
    Element url = title.select("a").first();

    // Get the URL here @@@
    System.out.println(url.attr("href"));

    // Unneeded elements
    Element platform = postIt.next();
    Element genre = postIt.next();

    // Get the release date of the game
    Element release = postIt.next();
    System.out.println(release.text() + "\n@@@@@@");
}

Edit: in your case:

Element name = postIt.next();
String nameString = name.text();

Element url = name.select("a").first();
String urlString = url.attr("href");
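
As a complement to the iterator approach above, here is a minimal sketch that walks the table row by row instead of calling next() four times per game, so a row with a missing cell cannot shift the fields out of step. It assumes the markup shown in the question (four cells per row, with the link in the first cell). The GameRow class and the parseRows method are hypothetical names introduced for illustration; they are not part of the asker's GameRelease class.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class RowLinkSketch {

    // Hypothetical holder for one table row (illustration only).
    static final class GameRow {
        final String name;
        final String url;
        final String platform;
        final String genre;
        final String releaseDate;

        GameRow(String name, String url, String platform, String genre, String releaseDate) {
            this.name = name;
            this.url = url;
            this.platform = platform;
            this.genre = genre;
            this.releaseDate = releaseDate;
        }
    }

    static List<GameRow> parseRows(String html) {
        Document doc = Jsoup.parse(html);
        List<GameRow> rows = new ArrayList<>();
        // Keep only the rows that contain at least one listing cell.
        for (Element row : doc.select("tr:has(td.indexList1), tr:has(td.indexList2)")) {
            Elements cells = row.children();
            if (cells.size() < 4) {
                continue; // assumption: a complete game row has four cells
            }
            // The anchor lives in the first cell; first() returns null if there is none.
            Element link = cells.get(0).select("a[href]").first();
            String url = (link != null) ? link.attr("href") : "";
            rows.add(new GameRow(cells.get(0).text(), url,
                    cells.get(1).text(), cells.get(2).text(), cells.get(3).text()));
        }
        return rows;
    }

    public static void main(String[] args) {
        // The <table> wrapper matters: jsoup discards <tr>/<td> tags found outside a table.
        String html = "<table><tr>"
                + "<td class=\"indexList1\"><a href=\"http://pc.gamespy.com/pc/hacker-evolution-duality-/\" class=\"b1\">"
                + "<em>Hacker Evolution Duality </em></a></td>"
                + "<td class=\"indexList1\"> PC </td>"
                + "<td class=\"indexList1\"> Adventure </td>"
                + "<td class=\"indexList1\"> August 15, 2011 <!--08/15/2011--></td>"
                + "</tr></table>";
        for (GameRow row : parseRows(html)) {
            System.out.println(row.name + " | " + row.url);
        }
    }
}
```

With the live page, the same loop would run on the Document returned by Jsoup.connect(...).get() instead of on a parsed string.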

Answer 1 (score: 1)

  

"Every item follows the same pattern, but I would like to know whether I can also retrieve each item's URL."

Elements links = doc.getElementsByTag("a"); // or doc.getElementsByClass("b1");

// Iterate over the links themselves and read each href attribute
ListIterator<Element> postIt = links.listIterator();
while (postIt.hasNext()) {
    Element link = postIt.next();
    String linkHref = link.attr("href");
}
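
Note that getElementsByTag("a") returns every link on the page, not only the game links. A hedged sketch of a narrower CSS selector follows: it matches only anchors that sit directly inside the listing cells, and it uses absUrl so that relative links, if the page has any, are resolved against the page address. The listingUrls method and the /pc/example-game/ path are hypothetical, introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ListingLinkSketch {

    // Collects the absolute URL of every anchor directly inside a listing cell.
    static List<String> listingUrls(String html, String baseUri) {
        // The base URI lets jsoup resolve relative href values.
        Document doc = Jsoup.parse(html, baseUri);
        List<String> urls = new ArrayList<>();
        for (Element link : doc.select("td.indexList1 > a[href], td.indexList2 > a[href]")) {
            urls.add(link.absUrl("href"));
        }
        return urls;
    }

    public static void main(String[] args) {
        // Hypothetical markup: one navigation link that should be ignored,
        // and one game link inside a listing cell.
        String html = "<a href=\"/home\">Home</a>"
                + "<table><tr><td class=\"indexList1\">"
                + "<a href=\"/pc/example-game/\" class=\"b1\">Example</a>"
                + "</td></tr></table>";
        for (String url : listingUrls(html, "http://www.gamespy.com/index/release.html")) {
            System.out.println(url);
        }
    }
}
```

When the Document comes from Jsoup.connect(url).get(), the base URI is already set from the request, so absUrl can be called without passing one explicitly.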