I want to use BeautifulSoup to get the school name "Perkins College ......" from this link.
The code I'm using returns nothing.
school = soup.find('a', 'profiles-show-school-name-sm-link')
print('school: ', school)
print('school.text: ', school.text)
Output:
school: <a class="profiles-show-school-name-sm-link" href="/profiles/show/online-degrees/stephen-f-austin-state-university/perkins-college-of-education-undergraduate/395/5401">
<img border="0" src="/images/profiles/243x60/4613/degrees/undergraduate-certificate-in-hospitality-administration.png"/>
</a>
school.text:
Any suggestions for extracting the school name (not the URL) with BeautifulSoup? Thanks!
Answer 0 (score: 1)
school = soup.find('a','profiles-show-school-name-sm-link')
url = school['href']
Assuming the school is always at the same position in the URL:
for i in range(5):
    url = url[url.find("/")+1:]  # drop everything up to and including the next "/"
schoolname = url[:url.find("/")]
print(" ".join(schoolname.split("-")).title())
Yields:
Perkins College Of Education Undergraduate
To get the university:
url = school['href']  # start again from the full href
for i in range(4):
    url = url[url.find("/")+1:]
university = url[:url.find("/")]
print(" ".join(university.split("-")).title())
Yields:
Stephen F Austin State University
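The same idea can probably be written more compactly by splitting the path on "/" and picking segments by index. This is only a sketch under the same assumption as above (the university and school always sit at the same positions in the URL). The helper name segment_title and the index values 3 and 4 are my own, taken from the href in the question, not from any BeautifulSoup API.

```python
def segment_title(href, index):
    """Return the path segment at `index` (0-based) as a title-cased name."""
    # Strip the leading "/" so the first real segment lands at index 0.
    segments = href.strip("/").split("/")
    # Turn a slug like "perkins-college" into "Perkins College".
    return " ".join(segments[index].split("-")).title()


# The href from the question; in the real script this would be school['href'].
href = ("/profiles/show/online-degrees/stephen-f-austin-state-university/"
        "perkins-college-of-education-undergraduate/395/5401")

university = segment_title(href, 3)
school_name = segment_title(href, 4)
print(university)
print(school_name)
```

If the site ever changes its URL layout, the indexes would have to be adjusted, just like the loop counts above.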