Question

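I'm trying to use BeautifulSoup to get some product data from a supermarket website. I need to extract the information attached to several h3 tags, but I'm getting it wrong (I'm currently learning Python, so this may be a very silly question).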

I tried using find and find_all, but I get the text inside the h3 tags rather than the text that comes after them, which is what I need.

The web page (https://shop.rewe.de/p/erdbaer-freche-freunde-apfel-banane-himbeere-100g/2388349) looks like this:

<div class="pdr-AttributeGroup  ">
     <div class="pdr-Attribute ">
          <h3 class="pdr-Attribute__label">
                "Marke"
                ":"
          </h3>
                "Erdbär"
     </div>
     <div class="pdr-Attribute ">
          <h3 class="pdr-Attribute__label">
                "Eigenschaften"
                ":" 
          </h3>
                "Vegan, Bio"
     </div>

My code is:

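import urllib.request
from bs4 import BeautifulSoup

page = urllib.request.urlopen('https://shop.rewe.de/p/erdbaer-freche-freunde-apfel-banane-himbeere-100g/2388349')
soup = BeautifulSoup(page, 'html.parser')
# Prints each h3 element itself, not the text that follows it
for h3 in soup.find_all("h3"):
    print(h3)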

I get these results:

<h3 class="pdr-Attribute__label">Zutaten<!-- -->: </h3>
<h3 class="pdr-Attribute__label">Aufbewahrungs- und Verwendungshinweis<!-- -->: </h3>
<h3 class="pdr-Attribute__label">Marke<!-- -->: </h3>
<h3 class="pdr-Attribute__label">Eigenschaften<!-- -->: </h3>
<h3 class="pdr-Attribute__label">Ursprungsland<!-- -->: </h3>
<h3 class="pdr-Attribute__label">Geschmacksrichtung<!-- -->: </h3>
<h3 class="pdr-Attribute__label">Sorte<!-- -->: </h3>
<h3 class="pdr-Attribute__label">Kontaktname<!-- -->: </h3>
<h3 class="pdr-Attribute__label">Kontaktadresse<!-- -->: </h3>

What I want is exactly the text that comes after "Marke", "Eigenschaften" and "Ursprungsland" (i.e. "Erdbär", "Vegan, Bio" and "Spanien"). Thanks in advance!

How do I get the text after an h3 tag?
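One possible approach, as a minimal sketch: the wanted text is not inside the h3 but is a sibling node that follows it within the same parent div, so it can be reached through the tag's next_siblings. The example below runs on a hypothetical SAMPLE_HTML string modelled on the markup and output shown in the question, not on the live page, and the helper name extract_attributes is made up for illustration. It assumes the real page serves the same structure, which has not been verified here.

```python
from bs4 import BeautifulSoup, Comment, NavigableString

# Hypothetical stand-in for the product page, modelled on the question's markup.
SAMPLE_HTML = """
<div class="pdr-AttributeGroup">
  <div class="pdr-Attribute">
    <h3 class="pdr-Attribute__label">Marke<!-- -->: </h3>Erdbär
  </div>
  <div class="pdr-Attribute">
    <h3 class="pdr-Attribute__label">Eigenschaften<!-- -->: </h3>Vegan, Bio
  </div>
  <div class="pdr-Attribute">
    <h3 class="pdr-Attribute__label">Ursprungsland<!-- -->: </h3>Spanien
  </div>
</div>
"""


def extract_attributes(html):
    """Map each h3 label to the text that follows the h3 inside its parent."""
    soup = BeautifulSoup(html, "html.parser")
    attributes = {}
    for h3 in soup.find_all("h3", class_="pdr-Attribute__label"):
        # Label: the text nodes inside the h3, skipping the HTML comment,
        # with the trailing colon and whitespace removed.
        label = "".join(
            s for s in h3.find_all(string=True) if not isinstance(s, Comment)
        ).strip().rstrip(":").strip()

        # Value: everything that comes after the h3 within the same parent.
        value_parts = []
        for sibling in h3.next_siblings:
            if isinstance(sibling, Comment):
                continue  # ignore comments between nodes
            if isinstance(sibling, NavigableString):
                value_parts.append(str(sibling))
            else:
                value_parts.append(sibling.get_text())
        attributes[label] = "".join(value_parts).strip()
    return attributes


if __name__ == "__main__":
    attributes = extract_attributes(SAMPLE_HTML)
    for label in ("Marke", "Eigenschaften", "Ursprungsland"):
        print(label, "->", attributes.get(label))
```

To try this on the real page, the soup built from urllib.request.urlopen in the question's code could be used in place of SAMPLE_HTML. Collecting all following siblings rather than only h3.next_sibling is a deliberate choice, in case a value is split across several nodes.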

0 answers: