I want to scrape multiple websites with similar URLs, for example https://woollahra.ljhooker.com.au/our-team, https://chinatown.ljhooker.com.au/our-team and https://bondibeach.ljhooker.com.au/our-team.
I have already written a script that works for the first site, but I am not sure how to tell it to also scrape the other two.
My code:
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url = "https://woollahra.ljhooker.com.au/our-team"

# download the page and read the HTML before parsing it
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

page_soup = soup(page_html, "html.parser")
containers = page_soup.findAll("div", {"class": "team-details"})

for container in containers:
    agent_name = container.findAll("div", {"class": "team-name"})
    name = agent_name[0].text
    phone = container.findAll("span", {"class": "phone"})
    mobile = phone[0].text
    print("name: " + name)
    print("mobile: " + mobile)
Is there a way to simply list the different parts of the URL (woollahra, chinatown, bondibeach) so that the script loops through each page using the code I have already written?
Answer 0 (score: 2)
locations = ['woollahra', 'chinatown', 'bondibeach']

for location in locations:
    my_url = 'https://' + location + '.ljhooker.com.au/our-team'
The rest of your code follows this; it will run for each element in the list, and you can add more locations later.
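To make the combination explicit, here is a minimal sketch of the full script with the question's scraping code moved inside that loop (assuming every office subdomain serves the same team-details markup):

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

locations = ['woollahra', 'chinatown', 'bondibeach']

for location in locations:
    my_url = 'https://' + location + '.ljhooker.com.au/our-team'

    # download and parse this office's page
    uClient = uReq(my_url)
    page_html = uClient.read()
    uClient.close()
    page_soup = soup(page_html, "html.parser")

    # same extraction as the single-site version, now run once per location
    containers = page_soup.findAll("div", {"class": "team-details"})
    for container in containers:
        name = container.findAll("div", {"class": "team-name"})[0].text
        mobile = container.findAll("span", {"class": "phone"})[0].text
        print("name: " + name)
        print("mobile: " + mobile)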
Answer 1 (score: 2)
You just want a loop:
for team in ["woollahra", "chinatown", "bondibeach"]:
    my_url = "https://{}.ljhooker.com.au/our-team".format(team)
    # fetch this office's page before parsing it
    uClient = uReq(my_url)
    page_html = uClient.read()
    uClient.close()
    page_soup = soup(page_html, "html.parser")
    # make sure you indent the rest of the code
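If you want to keep the scraped data rather than only print it, one option (purely illustrative, not part of either answer; team_results is a made-up name) is to collect a tuple per agent while looping over the offices:

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

team_results = []
for team in ["woollahra", "chinatown", "bondibeach"]:
    my_url = "https://{}.ljhooker.com.au/our-team".format(team)
    uClient = uReq(my_url)
    page_html = uClient.read()
    uClient.close()
    page_soup = soup(page_html, "html.parser")
    for container in page_soup.findAll("div", {"class": "team-details"}):
        name = container.findAll("div", {"class": "team-name"})[0].text
        mobile = container.findAll("span", {"class": "phone"})[0].text
        # remember which office each agent came from
        team_results.append((team, name, mobile))

for team, name, mobile in team_results:
    print(team + " - name: " + name + ", mobile: " + mobile)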