How do I select this part of the HTML with Jsoup?

Time: 2015-09-01 00:31:36

Tags: android jsoup

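Below is my code. I am trying to use jsoup to extract image links from a URL. The page contains multiple images; what follows is only part of the HTML and of the Java code.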

This is the HTML:

<div class="listing-content">
  <h2 class="listing-title" itemprop="name">Gta 5 Xbox 360</h2>
  <p class="listing-description
hide-fully-to-m" itemprop="description">Selling gta 5 for xbox360 perfect condition collection only in the oldham
    area</p>
  <ul class="listing-attributes inline-list hide-fully-to-m">
  </ul>
  <div class="listing-location" itemscope itemtype="http://schema.org/Place">
    <span class="truncate-line" itemprop="name"> Oldham, Manchester </span>
  </div>
  <strong class="listing-price txt-emphasis" itemprop="price">£8</strong> <strong class="listing-posted-date txt-normal truncate-line"
    itemprop="adAge"> <span class="hide-visually">Ad posted </span> 8 days ago
  </strong>
</div>
</a>
<span class="save-ad listing-save-ad" data-savead="channel:savead-1130547161"> <span class="hide-visually">Save this ad</span> <span
  class="icn-star iconu-m txt-quaternary" aria-hidden="true"></span>
</span>
</article>
</li>
<li>
  <article class="listing-maxi" itemscope itemtype="http://schema.org/Product" data-q=ad-1130434474>
    <a class="listing-link" href="/p/video-games/f1-2015-xbox-one/1130434474" itemprop="url">
      <div class="listing-side">
        <div class="listing-thumbnail ">
          <img src="" data-lazy="https://ssli.ebayimg.com/00/s/NTM3WDQyNQ==/z/1j8AAOSwMmBV2XTL/$_26.JPG" alt="" itemprop="image"
            class="hide-fully-no-js" />
          <noscript>
            <img src="https://ssli.ebayimg.com/00/s/NTM3WDQyNQ==/z/1j8AAOSwMmBV2XTL/$_26.JPG" alt="" itemprop="image" />
          </noscript>
        </div>
        <div class="listing-meta">
          <ul class="inline-list txt-center">
            <li>1<span class="hide-visually"> images</span> <span class="icn-camera txt-quaternary" aria-hidden="true"></span>
            </li>
          </ul>
        </div>
      </div>

This is the code I use to loop over the images:

Elements imagess = doc.select("img.data-lazy");
String[] imgg = new String[imagess.size()];


for (int i = 0; i < imagess.size(); i++) {
   imgg[i] = imagess.get(i).attr("abs:src");
}

I can't seem to get it to work.

1 answer:

Answer 0 (score: 2)

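Something like this should do it. The selector img.data-lazy matches img elements whose class is data-lazy, but in your HTML data-lazy is an attribute, and for the lazy-loaded images the URL sits in that attribute while src is empty: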

Elements imagess = doc.select("img"); //Select all images
String[] imgg = new String[imagess.size()];

for (int i = 0; i < imagess.size(); i++) {
    Element img = imagess.get(i);
    if(img.hasAttr("data-lazy")) {
        imgg[i] = img.absUrl("data-lazy"); //If the image has the attribute data-lazy, then take the url from there
    } else {
        imgg[i] = img.absUrl("src"); //If not then take the url from src.
    }
    System.out.println(imgg[i]);
}

If you only want to select the images that have the data-lazy attribute, use this:

doc.select("img[data-lazy]") //Select all images that have the attribute data-lazy
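
For completeness, here is a minimal, self-contained sketch of the attribute-selector approach. The class name LazyImageLinks, the helper lazyImageUrls, and the example.com URLs are made up for illustration; only Jsoup.parse, select, and absUrl come from jsoup. It assumes the HTML is already available as a string and that you know the page URL to use as the base URI for relative links.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

class LazyImageLinks {

    // Hypothetical helper: collects the absolute URL stored in the
    // data-lazy attribute of every <img> that carries that attribute.
    static List<String> lazyImageUrls(String html, String baseUri) {
        // The base URI lets absUrl() resolve relative attribute values.
        Document doc = Jsoup.parse(html, baseUri);

        List<String> urls = new ArrayList<>();
        // [data-lazy] is an attribute selector; .data-lazy would be a class selector.
        for (Element img : doc.select("img[data-lazy]")) {
            String url = img.absUrl("data-lazy");
            if (!url.isEmpty()) {
                urls.add(url);
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        // Made-up sample markup: two lazy images and one ordinary image.
        String html =
              "<div class=\"listing-thumbnail\">"
            + "<img src=\"\" data-lazy=\"https://img.example.com/a.jpg\" class=\"hide-fully-no-js\">"
            + "<img src=\"\" data-lazy=\"/thumbs/b.jpg\">"
            + "<img src=\"https://img.example.com/c.jpg\">"
            + "</div>";

        for (String url : lazyImageUrls(html, "https://www.example.com/")) {
            System.out.println(url);
        }
    }
}
```

The third image is skipped because it has no data-lazy attribute; if you also want those, fall back to absUrl("src") as in the loop above.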