Question

This is the link I am trying to get data from: flipkart

and this is part of the markup:

   <div class="toolbar-wrap line section">
   <div class="ratings-reviews-wrap">
      <div itemprop="aggregateRating" itemscope="" itemtype="http://schema.org/AggregateRating" class="ratings-reviews line omniture-field">
         <div class="ratings">
            <meta itemprop="ratingValue" content="1">
            <div class="fk-stars" title="1 stars">
               <span class="unfilled">★★★★★</span>
               <span class="rating filled" style="width:20%">
               ★★★★★
               </span>
            </div>
            <div class="count">
               <span itemprop="ratingCount">2</span>
            </div>
         </div>
      </div>

  </div>

</div>

Here I need to get "1 stars" from title="1 stars" and 2 from <span itemprop="ratingCount">2</span>.

I tried the following code:

 x = link_soup.find_all("div",class_='fk-stars')[0].get('title')

 print x, " product_star"
 y = link_soup.find_all("span",itemprop="ratingCount")[0].string.strip()
 print y

but it gives

IndexError: list index out of range

Answer 1

What you see in the browser is not actually present in the raw HTML retrieved from this URL.

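When the page is loaded in a browser, it makes AJAX calls to fetch additional content, which is then inserted into the page dynamically. One of those calls fetches the rating information you are after. Specifically, this URL is the one that returns the HTML inserted as the "action bar".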

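But if you retrieve the main page with Python, e.g. with requests, urllib, etc., the dynamic content is not loaded. That is why BeautifulSoup cannot find the tags: find_all() returns an empty list, and indexing it with [0] raises the IndexError.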
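To make the failure concrete, here is a minimal sketch using a stand-in HTML string instead of the real page (the string and the helper name `first_title` are illustrative assumptions, not taken from the site). It shows that a missing tag yields an empty result, and one way to guard against it with find(), which returns None instead of raising.

```python
from bs4 import BeautifulSoup

# Stand-in for the raw HTML returned by the server: the ratings block is absent
# because it would normally be inserted later by JavaScript.
raw_html = '<div class="toolbar-wrap line section"></div>'
soup = BeautifulSoup(raw_html, "html.parser")

# find_all() returns an empty list when nothing matches,
# so stars[0] would raise IndexError here.
stars = soup.find_all("div", class_="fk-stars")
print(len(stars))


def first_title(soup):
    """Return the title of the first fk-stars div, or None if it is missing.

    Hypothetical helper, shown only to illustrate guarding the lookup.
    """
    tag = soup.find("div", class_="fk-stars")
    return tag.get("title") if tag is not None else None


print(first_title(soup))
```

Guarding the lookup avoids the exception, but it does not make the data appear; the rating still has to be fetched from wherever the page loads it.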

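You could analyse the main page to find the actual link, retrieve that link, and then run the result through BeautifulSoup. The link appears to begin with /p/pv1/spotList1/spot1/actionBar, so searching for that prefix, or just for actionBar, might be enough to locate the actual link.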
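The first approach could be sketched roughly as follows. This assumes the action bar path appears somewhere in the main page source as a quoted string, which I have not verified; the function names, the regular expression, and the example query string are all illustrative assumptions.

```python
import re
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Assumption: the path is embedded in the page source inside quotes.
ACTION_BAR_RE = re.compile(
    r"""["'](/p/pv1/spotList1/spot1/actionBar[^"']*)["']"""
)


def find_action_bar_url(main_html, base_url):
    """Return the absolute action bar URL found in main_html, or None."""
    match = ACTION_BAR_RE.search(main_html)
    if match is None:
        return None
    return urljoin(base_url, match.group(1))


def parse_rating(fragment_html):
    """Return (stars_title, rating_count) from the action bar HTML.

    Either value is None when the corresponding tag is missing.
    """
    soup = BeautifulSoup(fragment_html, "html.parser")
    stars = soup.find("div", class_="fk-stars")
    count = soup.find("span", itemprop="ratingCount")
    return (
        stars.get("title") if stars is not None else None,
        count.get_text(strip=True) if count is not None else None,
    )


# Possible usage with requests (not run here; product_url is a placeholder):
#   import requests
#   main_html = requests.get(product_url).text
#   bar_url = find_action_bar_url(main_html, product_url)
#   if bar_url is not None:
#       print(parse_rating(requests.get(bar_url).text))
```

If the link turns out to be built by JavaScript rather than embedded as a literal string, the regular expression will not find it and the browser-based approach is the fallback.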

Or you could use selenium to load the page, then grab and process the rendered HTML.
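A rough sketch of the selenium route is below. It assumes Firefox and its driver are installed, and that waiting for div.fk-stars to appear is a reasonable signal that the action bar has loaded; the function names and the timeout are illustrative choices.

```python
from bs4 import BeautifulSoup


def render_page(url, timeout=10):
    """Load url in a real browser and return the HTML after scripts have run."""
    # Imported here so the parsing helper below works without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Firefox()  # assumption: Firefox and geckodriver are available
    try:
        driver.get(url)
        # Wait until the dynamically inserted ratings block is in the DOM.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, "div.fk-stars"))
        )
        return driver.page_source
    finally:
        driver.quit()


def rating_from_html(html):
    """Return (stars_title, rating_count); a value is None if its tag is missing."""
    soup = BeautifulSoup(html, "html.parser")
    stars = soup.find("div", class_="fk-stars")
    count = soup.find("span", itemprop="ratingCount")
    return (
        stars.get("title") if stars is not None else None,
        count.get_text(strip=True) if count is not None else None,
    )
```

Calling rating_from_html(render_page(product_url)) would then combine the two steps, where product_url stands for the product page address. This is slower than fetching the action bar link directly, but it does not depend on knowing how the page builds that link.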

IndexError: list index out of range when using bs4