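I am trying to recreate the web scraping tutorial from this site:
https://medium.freecodecamp.org/how-to-scrape-websites-with-python-and-beautifulsoup-5946935d93fe
This is my first project and I am working in Jupyter, but I ran into this error:
AttributeError: 'NoneType' object has no attribute 'text'
I tried changing the link, but that made no difference. I really don't understand the problem. Here is all of the code so far: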
# import the libraries
import urllib.request
from bs4 import BeautifulSoup
# specify the url
quote_page = "https://www.bloomberg.com/quote/SP1:IND"
page = urllib.request.urlopen(quote_page)
# parse the html using BeautifulSoup and store in variable `soup`
soup = BeautifulSoup(page, "html.parser")
# Find the <h1> that holds the name and get its text
name_box = soup.find("h1", attrs={"class": "name"})
name = name_box.text.strip()
# strip() removes leading and trailing whitespace
print(name)
# get the index price
price_box = soup.find("div", attrs={"class":"price"})
price = price_box.text.strip()
print (price)
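Any help would be greatly appreciated.
Answer 0 (score: 0)
I use Selenium for web scraping, but I think I can (maybe) help you.
I assume this is the section of your code that raises the error: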
price_box = soup.find("div", attrs={"class":"price"})
price = price_box.text.strip()
print (price)
What I would do is:
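price_box = soup.find("div", attrs={"class":"price"})
# find() returns None when nothing matches, so check before reading .text
if price_box is not None:
    print(price_box.text.strip())
else:
    print("price element not found")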
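For context, `soup.find` returns `None` when no element matches, and reading `.text` on `None` is what raises the AttributeError in the question. Here is a minimal sketch of a reusable guard. The helper name `text_or_default` and the default string are hypothetical, not part of BeautifulSoup, and the `SimpleNamespace` object is only a stand-in for a parsed tag so the snippet runs without fetching a page.

```python
from types import SimpleNamespace


def text_or_default(element, default="not found"):
    """Return the stripped text of a parsed element, or `default` if it is None."""
    if element is None:
        return default
    return element.text.strip()


# Stand-ins for what soup.find(...) can return: an object with .text, or None.
found = SimpleNamespace(text="  S&P 500 Index  ")
missing = None

print(text_or_default(found))
print(text_or_default(missing))
```

With the code from the question this would be called as `text_or_default(soup.find("h1", attrs={"class": "name"}))`. If it falls back to the default, the HTML that was downloaded probably does not contain that element, so it may be worth printing `soup.prettify()` and checking which tags and classes are actually present.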