I am trying to extract the text from an h1 tag returned by my code, but I get no output. The code does find the specified tag, as shown below:
<h1 class="product-name main-heading">Mixed Brown Rice 2.5kg</h1>
Web page link:
https://giantonline.com.sg/product/mixed-brown-rice-5142760
This is the code I used:
driver.get("https://giantonline.com.sg/product/mixed-brown-rice-5142760")
driver.implicitly_wait(30)
time.sleep(4)
bs2 = BeautifulSoup(driver.page_source, 'lxml')
for z in bs2.find_all('div', class_="col-md-5 col-sm-5 col-xs-12"):
    try:
        name = z.find('h1', class_='product-name')
        print(type(name))
        print(name)
        name = name.get_text(seperator=' ')
        print(name)
        size = z.find('h1', class_='product-size main-heading')
        size = size.text
        oldprice = z.find('div', class_='old-price')
        oldprice = oldprice.text
        price = z.find('div', class_='content_price')
        price = price.text
    except:
        continue
Why am I unable to extract the text from the h1 tag?
Answer 0 (score: 0)
This should work; please check this code.
from bs4 import BeautifulSoup
html = """<h1 class="product-name main-heading">Mixed Brown Rice 2.5kg</h1>"""
soup = BeautifulSoup(html, 'html.parser')
Title = soup.find('h1', attrs={'class':'product-name'}).text
print(Title)
Output:
Mixed Brown Rice 2.5kg
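One possible explanation for the missing output, assuming the question's code ran exactly as posted: the keyword in the get_text call is spelled `seperator`, while Beautiful Soup's parameter is `separator`. An unknown keyword raises a TypeError, and the bare `except: continue` in the loop would hide it, so nothing after that call is printed. The sketch below reproduces this on the static h1 snippet only (the variable names `error` and `text` are illustrative, and the live page is not fetched):

```python
from bs4 import BeautifulSoup

html = '<h1 class="product-name main-heading">Mixed Brown Rice 2.5kg</h1>'
soup = BeautifulSoup(html, "html.parser")
name = soup.find("h1", class_="product-name")

# Misspelled keyword, as in the question: get_text rejects it with a TypeError.
# Catching the specific exception and keeping it makes the failure visible
# instead of silently skipping the item.
error = None
try:
    name.get_text(seperator=" ")
except TypeError as exc:
    error = exc
print("error:", error)

# Correct spelling: 'separator'. strip=True trims surrounding whitespace.
text = name.get_text(separator=" ", strip=True)
print(text)
```

Replacing the bare `except:` with `except Exception as exc:` followed by `print(exc)` in the original loop would surface this kind of error directly.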