The find_all function in bs4 only works for some HTML tags
I am trying to scrape a website.
from bs4 import BeautifulSoup
import requests
url = 'https://bitskins.com/'
page_response = requests.get(url, timeout=5)
page_content = BeautifulSoup(page_response.content, 'html.parser')
# Gather the two lists
skin_list = page_content.find_all('div', attrs={'class': 'panel-heading item-title'})
wear_box = page_content.find_all('div', attrs={'class': 'text-muted text-center'})
Printing skin_list works, but when I print wear_box I get an empty list.
I also tried something else:
wear_box = page_content.html.search("Wear: {float}")
This raised an error saying that a 'NoneType' object is not callable.
I am using Sublime Text 3.
Answer 0 (score: 0)
from bs4 import BeautifulSoup
import requests
url = 'https://bitskins.com/'
page_response = requests.get(url, timeout=5)
page_content = BeautifulSoup(page_response.content, 'html.parser')
skin_list = page_content.findAll('div', class_ = 'panel item-featured panel-default')
for skin in skin_list:
    name = skin.find("div", class_ = "panel-heading item-title")
    price = skin.find("span", class_ = "item-price hidden")
    discount = skin.find("span", class_ = "badge badge-info")
    wear = skin.find("span", class_ = "hidden unwrappable-float-pointer")
    print("name:", name.text)
    print("Price", price.text)
    print("Discount:", discount.text)
    # Choose which one you want
    for w in wear.text.split(","):
        print("Wear:", w)
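You are searching for the wrong classes. I added a few other fields that you can use as an example. The wear element holds several comma-separated values, which I print one by one.
Answer 1 (score: 0)
In this line of code you are searching for a tag whose class attribute has multiple values: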
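One caveat with the loop above: find returns None when nothing matches, so calling .text on the result fails for any item that lacks one of the fields. The sketch below is only an illustration under assumptions: the inline HTML is a made-up stand-in for one item panel (not the real page), and text_or_none is a hypothetical helper. It also shows why page_content.html.search(...) in the question fails: BeautifulSoup treats an unknown attribute name as a tag lookup, which returns None here, and calling None raises the "not callable" error.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for one item panel; not the real page.
html = """
<div class="panel item-featured panel-default">
  <div class="panel-heading item-title">Example Skin</div>
  <span class="hidden unwrappable-float-pointer">0.11,0.25</span>
</div>
"""
soup = BeautifulSoup(html, "html.parser")
panel = soup.find("div", class_="panel item-featured panel-default")


def text_or_none(parent, tag, css_class):
    """Return the stripped text of the first match, or None if absent."""
    node = parent.find(tag, class_=css_class)
    return node.get_text(strip=True) if node is not None else None


name = text_or_none(panel, "div", "panel-heading item-title")
price = text_or_none(panel, "span", "item-price hidden")  # absent in this sample
wear = text_or_none(panel, "span", "hidden unwrappable-float-pointer")

# Unknown attribute names are treated as tag lookups, so this is None
# (there is no <search> tag); calling it would raise TypeError.
missing = soup.search

print(name, price, wear.split(","), missing)
```

With a helper like this, a missing price or discount shows up as None instead of stopping the loop with an AttributeError.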
wear_box = page_content.find_all('div', attrs={'class': 'text-muted text-center'})
The only tag on the page that comes close to matching is:
<div class="container text-center text-muted" style="padding-top: 17px;">
In BS4, when you search on an attribute that has multiple values, you either search for a single value, for example:
wear_box = page_content.find_all('p', attrs={'class': 'text-muted'})
Or you have to search for the exact string of values, in the same order as they appear in the attribute, for example:
wear_box = page_content.find_all('div', attrs={'class': 'container text-center text-muted'})
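To make the difference concrete, here is a small sketch that runs the searches described above against an inline copy of that tag. The "Wear: 0.15" text inside the div is invented for the example, and the last line uses a CSS selector via select, which is not part of the original answer but is one way to match several classes regardless of order.

```python
from bs4 import BeautifulSoup

# Inline copy of the tag quoted above; the text content is made up.
html = '<div class="container text-center text-muted">Wear: 0.15</div>'
soup = BeautifulSoup(html, "html.parser")

# A single class value matches no matter what other values are present.
by_single = soup.find_all("div", attrs={"class": "text-muted"})

# A multi-value string is compared against the whole attribute string,
# so a different order (or a missing value) does not match.
by_wrong_order = soup.find_all("div", attrs={"class": "text-muted text-center"})
by_exact = soup.find_all("div", attrs={"class": "container text-center text-muted"})

# A CSS selector requires both classes but ignores their order.
by_selector = soup.select("div.text-muted.text-center")

print(len(by_single), len(by_wrong_order), len(by_exact), len(by_selector))
```

On this snippet the single-value, exact-string and selector searches each find the tag, while the reordered string finds nothing, which is the same empty list seen in the question.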