from bs4 import BeautifulSoup
import requests
r = requests.get('http://medicalassociation.in/doctor-search')
soup = BeautifulSoup(r.text,'lxml')
link = soup.find('table',{'class':'tab-gender'})
link1 = link.find('tbody')
link2 = link1.find('tr')[3:4]
link3 = link2.find('a',class_='user-name')
print link3.text
I can't get the links with this code. I want to scrape the "View Profile" links.
Answer 0 (score: 0)
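requests.get() does not render JavaScript, so it cannot see elements that the page builds with scripts. You can use WebDriver instead, get the page_source, and then extract the information.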
from bs4 import BeautifulSoup
from selenium import webdriver

# Let a real browser load the page so script-generated content is present.
driver = webdriver.Chrome()
driver.get("http://medicalassociation.in/doctor-search")
soup = BeautifulSoup(driver.page_source, 'html.parser')
driver.quit()

# Print the href of every profile link that has one.
for a in soup.find_all('a', class_="user-name"):
    if a.get('href'):
        print(a['href'])
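Whether a browser is needed at all depends on whether the links are already present in the HTML the server sends. One rough way to check is to run the selector against the unrendered markup first. The sketch below is only an illustration: the helper name selector_matches_raw_html and the two sample strings are hypothetical and are not taken from the real page.

```python
from bs4 import BeautifulSoup


def selector_matches_raw_html(html, css_selector):
    """Return True if css_selector matches anything in the unrendered HTML."""
    soup = BeautifulSoup(html, 'html.parser')
    return soup.select_one(css_selector) is not None


# Hypothetical snippets standing in for a server response.
STATIC_HTML = '<a class="user-name" href="/doctor/1">View Profile</a>'
SCRIPTED_HTML = '<div id="results"></div><script>/* links added at runtime */</script>'

print(selector_matches_raw_html(STATIC_HTML, 'a.user-name'))
print(selector_matches_raw_html(SCRIPTED_HTML, 'a.user-name'))
```

If the check succeeds on r.text from requests.get(), plain requests is probably enough; if it fails, the content is likely added by JavaScript and the WebDriver approach is the safer choice.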
Answer 1 (score: 0)
The following worked over multiple test runs. Just use requests and select with a class selector.
import requests
from bs4 import BeautifulSoup as bs
r = requests.get('http://medicalassociation.in/doctor-search')
soup = bs(r.content, 'lxml')
results = [item['href'] for item in soup.select(".user-name")]
print(results)
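For comparison, the row-by-row traversal from the question can also be made to work on static HTML. find returns a single element, so it cannot be sliced, whereas find_all returns a list that can be indexed. This is a minimal sketch, assuming markup shaped like the hypothetical SAMPLE_HTML below; the real page may be structured differently, and link_in_row is an illustrative helper name.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration; the real page may differ.
SAMPLE_HTML = """
<table class="tab-gender">
  <tbody>
    <tr><td><a class="user-name" href="/doctor/1">Dr. A</a></td></tr>
    <tr><td><a class="user-name" href="/doctor/2">Dr. B</a></td></tr>
    <tr><td>No link in this row</td></tr>
    <tr><td><a class="user-name" href="/doctor/4">Dr. D</a></td></tr>
  </tbody>
</table>
"""


def link_in_row(html, row_index):
    """Return the href of the user-name link in the given table row, or None."""
    soup = BeautifulSoup(html, 'html.parser')
    table = soup.find('table', {'class': 'tab-gender'})
    if table is None:
        return None
    rows = table.find_all('tr')  # find_all returns a list, so indexing works
    if row_index >= len(rows):
        return None
    a = rows[row_index].find('a', class_='user-name')
    return a['href'] if a is not None and a.has_attr('href') else None


print(link_in_row(SAMPLE_HTML, 3))
```

The None checks matter because find returns None when nothing matches, which is also what happens when the table is not present in the downloaded HTML.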