I am trying to scrape the names of real estate agents from this website.
My code:
containers = page_soup.findAll("div",{"class":"team-details"})
for container in containers:
    agent_name = container.findAll("a", {"class":"team-name_link"})
    name = agent_name[0].text
    print("name: " + name)
However, when I run the script, I only get the first two names, followed by an error message:
name: Michael Stavrianos
name: Kristalla Stavrianos
Traceback (most recent call last):
  File "C:\Users\Toby\Desktop\Webscrape\LjHooker - mark1.py", line 16, in <module>
    name = agent_name[0].text
IndexError: list index out of range
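I found that the first two agent names belong to the class "team-name_link", but the rest belong to the class "team-name". I'm not sure how to scrape the names from both classes at once.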
Answer 0 (score: 2)
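I think you have this slightly wrong: every name is inside the tag you want, but you actually need to look for the div with class "team-name" rather than the link: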
from bs4 import BeautifulSoup
import requests

# Download the team page and parse it
html = requests.get("https://woollahra.ljhooker.com.au/our-team").text
soup = BeautifulSoup(html, 'html.parser')
containers = soup.findAll("div",{"class":"team-details"})
for container in containers:
    # find() returns a single tag (or None) instead of a list
    agent_name = container.find("div", {"class":"team-name"})
    name = agent_name.text
    print(name)
The code above outputs:
Michael Stavrianos
Licensee
Kristalla Stavrianos
Principal
Jade Marshall
Property Management Associate
Emma Phelan
Property Management Associate
Isabella Marechal - Ross
Property Management Associate
Victoria Empson
Property Investment Manager
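The output above includes each agent's role as well as the name. If only the name is wanted, something like the following may work. It is a minimal sketch rather than a tested solution for the live page: `SAMPLE_HTML` is made-up markup that only assumes what the question describes (some names wrapped in an `a` with class "team-name_link", others sitting directly in the "team-name" div), and `extract_agent_names` is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, assumed for illustration only; the real page may differ.
SAMPLE_HTML = """
<div class="team-details">
  <div class="team-name">
    <a class="team-name_link" href="/michael">Michael Stavrianos</a>
    <span>Licensee</span>
  </div>
</div>
<div class="team-details">
  <div class="team-name">
    Jade Marshall
    <span>Property Management Associate</span>
  </div>
</div>
<div class="team-details"></div>
"""


def extract_agent_names(html):
    """Return one name per team-details block, skipping blocks without a name."""
    soup = BeautifulSoup(html, "html.parser")
    names = []
    for container in soup.find_all("div", {"class": "team-details"}):
        name_div = container.find("div", {"class": "team-name"})
        if name_div is None:
            # No name div in this block: skip it instead of raising an error
            continue
        link = name_div.find("a", {"class": "team-name_link"})
        if link is not None:
            # Name is wrapped in a link
            names.append(link.get_text(strip=True))
        else:
            # Otherwise take the first non-empty piece of text in the div
            first_text = next(name_div.stripped_strings, None)
            if first_text:
                names.append(first_text)
    return names


if __name__ == "__main__":
    for name in extract_agent_names(SAMPLE_HTML):
        print("name: " + name)
```

Checking for `None` before reading the text also avoids the kind of crash shown in the question when a block does not contain the expected tag.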