How to, depending on the structure

Time: 2017-01-10 21:04:07

Tags: java parsing

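I want to extract some data from a number of Xbox store links. The problem I am running into is that, in the section that shows the price, the HTML structure is different when the game has a discount.

The code I wrote to scrape the price: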

String urlPage = "https://www.microsoft.com/en-us/store/p/call-of-duty-advanced-warfare-gold-edition/c20hl06x0v8w";
System.out.println("Comprobando entradas de: " + urlPage);

// getStatusConnectionCode and getHtmlDocument are helper methods not shown here
int statusCode = getStatusConnectionCode(urlPage);
if (statusCode == 200) {

    Document document = getHtmlDocument(urlPage);

    // Select the price block inside the product hero section
    Elements entradas = document.select("div.m-product-detail-hero-product-placement div.price-info");

    for (Element elem : entradas) {
        // srv_saleprice only appears in the discounted markup shown below
        String titulo = elem.getElementsByClass("srv_saleprice").text();
    }

} else {
    System.out.println("El Status Code no es OK es: " + statusCode);
}

HTML for a game without a discount:

URL for first case

<div class="price-info"> 
 <div class="c-price"> 
  <div class="price-text srv_price"> 
   <div class="ea-vault-message hidden x-hidden"> 
    <div>
     Available in The Vault
    </div> 
    <div>
     or
    </div> 
   </div> 
   <span>$59.99</span> 
   <sup>+</sup> 
  </div> 
  <div class="srv_microdata" itemprop="offers" itemscope itemtype="http://schema.org/Offer"> 
   <meta itemprop="price" content="59.99"> 
   <meta itemprop="priceCurrency" content="USD"> 
  </div> 
 </div> 
</div>

For a game with a discount:

URL for the second case

<div class="price-info"> 
 <div class="c-price"> 
  <div class="price-text srv_price"> 
   <div class="ea-vault-message hidden x-hidden"> 
    <div>
     Available in The Vault
    </div> 
    <div>
     or
    </div> 
   </div> 
   <s class="srv_saleprice" aria-label="Full price was $159.99">$159.99</s> 
   <span>&nbsp;</span> 
   <div class="price-disclaimer"> 
    <span>$135.99</span> 
    <sup>+</sup> 
   </div> 
   <span>&nbsp;</span> 
   <span></span> 
  </div> 
  <div class="caption text-muted srv_countdown"> 
   <span class="sub">save $24.00</span> 
  </div> 
  <div class="srv_microdata" itemprop="offers" itemscope itemtype="http://schema.org/Offer"> 
   <meta itemprop="price" content="135.99"> 
   <meta itemprop="priceCurrency" content="USD"> 
  </div> 
 </div> 
</div>

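In the second example, the value inside the element is $135.99, but that is not the game's base price (which is $159.99 in this case).

How can I extract only the base price of each game, whether or not it has a discount?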
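One possible approach, sketched here only against the two HTML snippets above: look for the struck-through `s.srv_saleprice` element first, and fall back to the `<span>` that is a direct child of `div.price-text.srv_price` when it is absent. The class name `BasePriceExtractor` and its methods are hypothetical names chosen for illustration, and the sketch assumes every product page follows one of these two layouts, which may not hold for the live site.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;

public class BasePriceExtractor {

    /**
     * Returns the base (undiscounted) price text found under the given root,
     * or null if neither layout matches. Hypothetical helper, not part of jsoup.
     */
    public static String extractBasePrice(Element root) {
        Element priceText = root.select("div.price-info div.price-text.srv_price").first();
        if (priceText == null) {
            return null;
        }
        // Discounted layout: the original price is in <s class="srv_saleprice">.
        Element struck = priceText.select("s.srv_saleprice").first();
        if (struck != null) {
            return struck.text();
        }
        // Regular layout: the price is a <span> that is a direct child of the block.
        for (Element child : priceText.children()) {
            if (child.tagName().equals("span") && !child.text().trim().isEmpty()) {
                return child.text();
            }
        }
        return null;
    }

    /** Convenience overload that parses an HTML string first. */
    public static String extractBasePriceFromHtml(String html) {
        return extractBasePrice(Jsoup.parse(html));
    }

    public static void main(String[] args) {
        // Shortened versions of the two snippets from the question.
        String regular = "<div class='price-info'><div class='c-price'>"
                + "<div class='price-text srv_price'><span>$59.99</span><sup>+</sup></div>"
                + "</div></div>";
        String discounted = "<div class='price-info'><div class='c-price'>"
                + "<div class='price-text srv_price'>"
                + "<s class='srv_saleprice'>$159.99</s><span>&nbsp;</span>"
                + "<div class='price-disclaimer'><span>$135.99</span><sup>+</sup></div>"
                + "</div></div></div>";

        System.out.println(extractBasePriceFromHtml(regular));    // $59.99
        System.out.println(extractBasePriceFromHtml(discounted)); // $159.99
    }
}
```

In the original loop this would amount to calling `extractBasePrice(elem)` on the parsed `document` instead of reading `srv_saleprice` directly. Note that the `meta itemprop="price"` value in both snippets holds the current price (135.99 in the discounted case), so it cannot be used for the base price.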

0 answers:

No answers