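I am trying to get the addresses of all therapists for a given zip code. I want to enter a zip code and get a list of results, then go into each individual result and scrape the provider's address.
I am new to Python. I have been trying to use requests and BeautifulSoup. Would Selenium perhaps be a better choice?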
I am stuck now and do not know how to proceed. P.S. I am taking a Python course, so please be kind.
Answer 0 (score: 1)
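Try the code below; it gets the addresses of all therapists listed for the given zip code.
Note that it only returns the addresses from the first page of results. If you need the addresses from every page, you should use Selenium, which should solve that problem.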
import requests
from bs4 import BeautifulSoup
from bs4.element import Tag

url = 'https://www.psychologytoday.com/us/therapists/60148'
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'}

# Fetch and parse the results page for the zip code.
page = requests.get(url, headers=headers)
soup = BeautifulSoup(page.text, 'html.parser')
result = soup.find(class_='results-column')

addressArray = []
for tag in result:
    # Skip text nodes; only element tags can be result rows.
    if not isinstance(tag, Tag):
        continue
    _class = tag.get("class")
    if _class is None or "row" not in _class:
        continue
    # Follow the link in the result's action button to the provider's own page.
    link = tag.find(class_='result-actions').find('a', href=True)
    _href = link['href']
    address_link = requests.get(_href, headers=headers)
    soup1 = BeautifulSoup(address_link.text, 'html.parser')
    address = soup1.find(class_='address').find(class_="location-address-phone")
    # Join the non-empty lines of the address block with commas.
    text = ''
    for data in address.text.strip().split('\n'):
        if not data.strip():
            continue
        if not text:
            text = data.strip()
        else:
            text = text + "," + data.strip()
    if text:
        addressArray.append(text)

print(addressArray)
Output:
['Lia Reynolds, LCSW,Lombard, Illinois 60148,(630) 343-5819', 'Clarity Counseling and Wellness, LLC,477 Butterfield Road,#202,Lombard, Illinois 60148,(630) 656-9713', '450 East 22nd St.,Suite 172,Lombard, Illinois 60148,(773) 599-3959', '10 E 22nd Street,Suite 217,Lombard, Illinois 60148,(630) 517-9505', 'Ron Ahlberg & Associates,477 E Butterfield Rd,Suite 310,Lombard, Illinois 60148,(630) 451-8653', 'Health Transitions Counseling,477 Butterfield Road,Suite 310,Lombard, Illinois 60148,(630) 785-6642', 'Way Beyond Counseling and Coaching,477 E Butterfield Road,Floor 3 - Wellness Center - Office 7,Lombard, Illinois 60148,Call Mr. Larry Westenberg,(630) 556-8484', 'Chicago Area Behavioral Health Services,150 W St Charles Road,Lombard, Illinois 60148,Call Augustus Edeh. Chicago Area Behavioral Health Services,(630) 599-8032', 'Adult Children Center, Ltd,2 East 22nd Street,Suite 302,Lombard, Illinois 60148,(630) 387-9750', 'Midwest Center for Hope & Healing, Ltd.,1165 S Westmore-meyers Rd,Lombard, Illinois 60148,(630) 765-5355', 'Madrigal Consulting and Counseling, LLP,450 E. 22nd Street,Suite 150,Lombard, Illinois 60148,Call Cesar Madrigal,(630) 413-9942', '477 E Butterfield Rd,Suite 202,Lombard, Illinois 60148,(630) 560-6920', 'Lombard,Lombard, Illinois 60148,(630) 796-7904', 'Dupage Clinical Counseling Services,450 E 22nd St,150,Lombard, Illinois 60148,(630) 313-4990', '2200 S Main St,Suite 316,Lombard, Illinois 60148,(630) 426-7819', 'Institute for Motivational Development,10 E 22nd Street, Suite 217,Lombard, Illinois 60148,(309) 723-8170', 'Michele DeCanio Counseling Services,2200 S. Main Street,Suite 305,Lombard, Illinois 60148,(630) 560-6926', 'A New Day Counseling Center,450 E 22nd St,Suite 150,Lombard, Illinois 60148,(630) 748-8261', '477 E Butterfield Rd,Suite 310,Lombard, Illinois 60148,(630) 426-6878', 'Bricolage Wellness,477 Butterfield Road,Suite 202,Lombard, Illinois 60148,(630) 426-7823']
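Each string in that output is the text of one address block with its blank lines dropped and the remaining lines joined by commas. As a minimal sketch, that step could be pulled out into a small helper so it can be tried without touching the network. The function name `join_address_lines` and the sample text are hypothetical and used only for illustration; they are not part of the answer's code.

```python
def join_address_lines(raw_text, sep=","):
    """Collapse multi-line text into one string, dropping blank lines."""
    parts = [line.strip() for line in raw_text.split("\n")]
    return sep.join(part for part in parts if part)


# Hypothetical sample, shaped like the text of a "location-address-phone" element.
sample = """
    Bricolage Wellness
    477 Butterfield Road

    Suite 202
    Lombard, Illinois 60148
    (630) 426-7823
"""

print(join_address_lines(sample))
```

In the scraper, the call would presumably look like `join_address_lines(address.text)`, with the result appended to `addressArray` when it is not empty.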
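In the code above, 'result-actions' is the class of the view button that opens each provider's own page, so a second request is needed to get the full address.
"location-address-phone" is the class on that page that holds the address and phone number to scrape.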
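Since the question asks for a way to enter a zip code, here is a rough sketch of building the results URL from user input rather than hard-coding it. It assumes the site keeps the URL pattern used above, where the path ends in the zip code, and that only 5-digit US zip codes are wanted. `BASE_URL` and `listing_url` are hypothetical names introduced for this example.

```python
import re

# Assumption: the results page URL is this prefix followed by the zip code,
# as in the hard-coded URL used in the answer.
BASE_URL = "https://www.psychologytoday.com/us/therapists/"


def listing_url(zip_code):
    """Return the results-page URL for a 5-digit US zip code."""
    zip_code = str(zip_code).strip()
    if not re.fullmatch(r"[0-9]{5}", zip_code):
        raise ValueError(f"expected a 5-digit zip code, got {zip_code!r}")
    return BASE_URL + zip_code


print(listing_url("60148"))
```

Passing the zip code as a string is safer than passing an integer, because zip codes with a leading zero would lose that digit as integers and then fail the check.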
Documentation links: