My web page is structured as follows:
//a bunch of container divs....
<a class="food cat2 isotope-item" href="#" style="position: absolute; left: 45px; top: 0px;">
<div class="background"></div>
<div class="image">
<img src="/assets/score-images/cereal2.png" alt="">
</div>
<div class="score">1148</div>
<div class="name">Cereal with Banana</div>
</a>
<a class="food cat1 isotope-item" href="#" style="position: absolute; left: 215px; top: 0px;">
<div class="background"></div>
<div class="image">
<img src="/assets/score-images/burrito-all.png" alt="">
</div>
<div class="score">2257</div>
<div class="name">Beef & Cheese Burrito</div>
</a>
//hundreds more a tags....
</div>
I am running this code to extract the name and score from each "a" tag.
import requests
from bs4 import BeautifulSoup

# Fetch the page and parse it
page = requests.get('http://www.eatlowcarbon.org/food-scores')
soup = BeautifulSoup(page.content, 'html.parser')
print('HEllO')
foodDict = {}
aTag = soup.findAll('a')  # matches every <a> tag on the page
for tag in aTag:
    print('HELLO 2')
    name = tag.find("div", {"class": "name"}).text
    score = tag.find("div", {"class": "score"}).text
    foodDict[name] = score
print('hello')
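The first two print statements run successfully, so the second one tells me that the for loop is at least entered. However, I get this error: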
File "scrapeRecipe.py", line 40, in <module>
name = tag.find("div", {"class": "name"}).text
AttributeError: 'NoneType' object has no attribute 'text'
From this post, I assume my code is not finding any div whose class equals "name". I am new to Python. Does anyone have a suggestion?
Answer 0 (score: 2)
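The problem is not with your tag.find('div', ...), but with your soup.findAll('a'). You are pulling every a tag on the page, including those that do not contain the child div tags you are trying to read from. For those tags, find returns None, and accessing .text on None raises the AttributeError. Depending on your needs, you should add a class filter to findAll: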
aTag = soup.findAll('a', {'class': 'food'})
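
For completeness, here is a minimal sketch of the whole loop with that filter applied. It runs against a small inline HTML string rather than the live page, so SAMPLE_HTML, extract_food_scores, and the extra "About" link are illustrative assumptions, not part of the original code. It also skips any tag where either div is missing, which is one possible way to guard against the same error if a "food" link ever lacks a name or score.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup modeled on the question's page.
# The plain "About" link stands in for the non-food <a> tags on the real site.
SAMPLE_HTML = """
<div>
  <a href="/about">About</a>
  <a class="food cat2 isotope-item" href="#">
    <div class="score">1148</div>
    <div class="name">Cereal with Banana</div>
  </a>
  <a class="food cat1 isotope-item" href="#">
    <div class="score">2257</div>
    <div class="name">Beef &amp; Cheese Burrito</div>
  </a>
</div>
"""


def extract_food_scores(html):
    """Return a dict mapping each food name to its score (both as strings)."""
    soup = BeautifulSoup(html, "html.parser")
    food_scores = {}
    # Only consider <a> tags whose class list includes "food".
    for tag in soup.find_all("a", class_="food"):
        name_div = tag.find("div", class_="name")
        score_div = tag.find("div", class_="score")
        # find() returns None when nothing matches, so check before reading text.
        if name_div is None or score_div is None:
            continue
        food_scores[name_div.get_text(strip=True)] = score_div.get_text(strip=True)
    return food_scores


food_scores = extract_food_scores(SAMPLE_HTML)
print(food_scores)
```

To use this on the real page, pass page.content from requests.get(...) in place of SAMPLE_HTML. find_all is the current spelling of findAll; both behave the same in Beautiful Soup 4.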