How to get the text of a <div> that contains a <span> with Beautiful Soup

Time: 2018-11-01 14:09:44

Tags: python python-3.x beautifulsoup

This is the HTML:

<div class="mxjs-variant-selector mx-variant-selector"
     data-ga-label=""
     name="" >Title <br>
              <span class="mx-price">Price</span>
</div>

I want to get Title and Price into separate variables.

This is my code:

name_box = soup.find('div', attrs={'class': 'mxjs-variant-selector mx-variant-selector'})
title = name_box.text.strip()
name_box1 = soup.find("div", class_="mxjs-variant-selector mx-variant-selector").find("span", class_="mx-price").text
price = name_box1

For the title I get:

Title
(with newline)
Price

For the price I get:

Price

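1 answer:

Answer 0 (score: 0)

First get the text from the span element, then remove the span from the soup. Once the span is gone, the div's remaining text is just the title.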

from bs4 import BeautifulSoup

html = """<div class="mxjs-variant-selector mx-variant-selector"
     data-ga-label=""
     name="" >Title <br>
              <span class="mx-price">Price</span>
</div>"""

soup = BeautifulSoup(html, "html.parser")

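div = soup.find("div", class_="mxjs-variant-selector mx-variant-selector")
# Read the price before removing the span from the tree.
price = div.span.text
# Detach the span so its text no longer shows up in the div's text.
div.span.extract()
title = div.get_text(strip=True)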

print(title)
print(price)

This gives you:

Title
Price
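
If the parsed tree needs to stay intact (for example, because the span is read again later), a non-destructive variant may be preferable. The sketch below is not part of the original answer; it assumes the title is always the first text node sitting directly inside the div, and it reads that node with `find(string=True, recursive=False)` instead of removing the span.

```python
from bs4 import BeautifulSoup

html = """<div class="mxjs-variant-selector mx-variant-selector"
     data-ga-label=""
     name="" >Title <br>
              <span class="mx-price">Price</span>
</div>"""

soup = BeautifulSoup(html, "html.parser")
div = soup.find("div", class_="mxjs-variant-selector mx-variant-selector")

# Assumption: the title is the first text node that is a direct child of the div.
# recursive=False limits the search to direct children, so the span's text is skipped.
title = div.find(string=True, recursive=False).strip()

# The span is still in the tree, so the price can be read at any time.
price = div.find("span", class_="mx-price").get_text(strip=True)

print(title)  # Title
print(price)  # Price
```

Because nothing is extracted, `div.span` is still available afterwards. If the markup ever places another text node ahead of the title, this assumption breaks and the `extract()` approach above is the safer choice.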