How to find the required data among multiple divs in BeautifulSoup

Time: 2016-03-09 13:43:06

Tags: python django beautifulsoup

This is the HTML code. I want to select data that sits inside several nested div tags.

<div class="details-wrapper apps-secondary-color">
    <div class="details-section metadata">
        <div class="details-section-heading">
         <div class="details-section-contents">
             <div class="meta-info">
                 <div class="title">Updated</div>
                 <div class="content" itemprop="datePublished">March 7, 2016</div>
                 </div>
                 <div class="meta-info">
                 <div class="meta-info">
                 <div class="meta-info">
<div class="meta-info">
<div class="meta-info contains-text-link">
<div class="meta-info">
<div class="meta-info">
<div class="meta-info">
<div class="meta-info contains-text-link">
<div class="meta-info">
<div class="meta-info meta-info-wide">
<div class="details-sharing-section">
</div>
<div class="details-section-divider"></div>
</div>
</div>
</div>

I want to select "March 7, 2016". How can I select it with BeautifulSoup?

1 Answer:

Answer 0 (score: 1)

You can use soup.find('div', {'itemprop': 'datePublished'}) to select the div element whose itemprop attribute is datePublished.

Demo

from bs4 import BeautifulSoup

content = '''<div class="details-wrapper apps-secondary-color">
    <div class="details-section metadata">
        <div class="details-section-heading">
         <div class="details-section-contents">
             <div class="meta-info">
                 <div class="title">Updated</div>
                 <div class="content" itemprop="datePublished">March 7, 2016</div>
                 </div>
                 <div class="meta-info">
                 <div class="meta-info">
                 <div class="meta-info">
<div class="meta-info">
<div class="meta-info contains-text-link">
<div class="meta-info">
<div class="meta-info">
<div class="meta-info">
<div class="meta-info contains-text-link">
<div class="meta-info">
<div class="meta-info meta-info-wide">
<div class="details-sharing-section">
</div>
<div class="details-section-divider"></div>
</div>
</div>
</div>'''

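# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(content, 'html.parser')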
date = soup.find('div', {'itemprop':'datePublished'})
print(date.text)

Output

March 7, 2016
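
If the itemprop attribute is not available for every field, two other approaches may help. The following is a minimal sketch, not part of the original answer: it shows the CSS-selector form of the same lookup via select_one, and a lookup by the visible label ("Updated") across several meta-info blocks. The sample HTML is a hypothetical, well-formed fragment modeled on the question's markup (the second block and its values are invented for illustration), and meta_value is a helper name made up for this example.

```python
from bs4 import BeautifulSoup

# Hypothetical, well-formed sample modeled on the question's markup.
# The second meta-info block is invented purely for illustration.
html = '''
<div class="details-section-contents">
  <div class="meta-info">
    <div class="title">Updated</div>
    <div class="content" itemprop="datePublished">March 7, 2016</div>
  </div>
  <div class="meta-info">
    <div class="title">Example label</div>
    <div class="content">Example value</div>
  </div>
</div>
'''


def meta_value(soup, label):
    """Return the text of the 'content' div paired with the given title, or None.

    Hypothetical helper: walks every div.meta-info block and compares
    the text of its div.title child with `label`.
    """
    for block in soup.find_all('div', class_='meta-info'):
        title = block.find('div', class_='title')
        content = block.find('div', class_='content')
        if title is None or content is None:
            continue  # skip blocks that lack either part
        if title.get_text(strip=True) == label:
            return content.get_text(strip=True)
    return None


soup = BeautifulSoup(html, 'html.parser')

# CSS-selector form of soup.find('div', {'itemprop': 'datePublished'})
by_attr = soup.select_one('div[itemprop="datePublished"]')
print(by_attr.get_text(strip=True) if by_attr is not None else None)

# Lookup by visible label instead of by attribute
print(meta_value(soup, 'Updated'))
print(meta_value(soup, 'Missing'))  # no such label, so None
```

Both find and select_one return None when nothing matches, so checking the result before reading .text avoids an AttributeError on pages where the field is missing.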