Question

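I am trying to scrape this web page. I want to extract the product names, but for some reason my find_all call does not work as expected and fails to find all the tags I specify.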

Here is what I am doing:

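from urllib import request

from bs4 import BeautifulSoup

url = 'https://www.toysrus.fi/nallet-ja-pehmolelut/interaktiiviset-pehmolelut'
# Download the page and parse it with Python's built-in HTML parser
soup = BeautifulSoup(request.urlopen(url).read(), 'html.parser')
# Count the div tags whose class is 'inner-wrapper'
print(len(soup.find_all('div', {'class': 'inner-wrapper'})))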

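The page actually contains 4 elements with class='inner-wrapper', but only 1 is found. Please advise how to scrape the product names from the page and how to get the correct count of these div tags. Thanks.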

Answer 1

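Beautiful Soup only finds genuine HTML div tags; tags that happen to sit inside a script are ignored. Unfortunately, Beautiful Soup does not evaluate scripts. If you open the HTML source, you will see one HTML div with that class, plus a number of script/JS templates like the following: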

<script type="text/x-jsrender" id="product-list-skuid-template">
  <div class="product-list-component type-{{:TemplateInfo.type}} outer-wrapper">
    <div class="inner-wrapper">
      <ul class="product-list-container">
        {{for Data}} {{include tmpl="#product-template"/}} {{/for}}
      </ul>
    </div>
  </div>
  {{!-- SHADOW --}} {{if TemplateInfo.divider=='roundshadow'}}
  <div class="round-shadow"></div>
  {{else TemplateInfo.divider=='simple'}}
  <hr /> {{/if}}
</script>
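To see why the count comes out lower than expected, here is a minimal sketch that uses only Python's standard html.parser module, which is what BeautifulSoup's 'html.parser' option builds on. The SAMPLE_HTML string, the DivCounter class and the count_divs helper are all hypothetical names made up for this illustration, and the sample is a shortened stand-in for the real page, not its actual source.

```python
from html.parser import HTMLParser

# Hypothetical, shortened stand-in for the page source: one real div plus
# one div that only exists as text inside a JsRender template.
SAMPLE_HTML = """
<div class="inner-wrapper">rendered markup</div>
<script type="text/x-jsrender" id="product-list-skuid-template">
  <div class="inner-wrapper"><ul class="product-list-container"></ul></div>
</script>
"""


class DivCounter(HTMLParser):
    """Counts div tags with class 'inner-wrapper' and collects script bodies."""

    def __init__(self):
        super().__init__()
        self.count = 0
        self.in_script = False
        self.script_chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True
        elif tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            if "inner-wrapper" in classes:
                self.count += 1

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        # Script bodies arrive here as raw text, never as start tags.
        if self.in_script:
            self.script_chunks.append(data)


def count_divs(html):
    """Return (number of matching divs, concatenated script text)."""
    parser = DivCounter()
    parser.feed(html)
    parser.close()
    return parser.count, "".join(parser.script_chunks)


visible, template_text = count_divs(SAMPLE_HTML)
hidden, _ = count_divs(template_text)  # second pass over the template text
print(visible, hidden)
```

The first pass counts only the div outside the script; the div inside the template shows up only when the script text is parsed again as HTML. With BeautifulSoup, the analogous second pass would be to take a script tag's .string and hand it to BeautifulSoup(..., 'html.parser') once more. Note that this recovers only the template markup with its {{...}} placeholders, not real product names, since those are filled in by JavaScript in the browser.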

Beautifulsoup find_all does not find all tags (Python)
