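How do I access the <tr> tags inside <tbody> using find_all in a for loop?

Each <tr> seems to be independent of the others, and the rows alternate between the classes 'even' and 'odd'. As far as I can tell, I can only pass two arguments to find_all, i.e. find_all('tr', class_='odd') or the same call with 'even'.

Also, how can I access only the 1st, 3rd, 4th, and 6th <td> tags separately? The tags have no ID or class.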
from bs4 import BeautifulSoup
import requests
src_code = requests.get('https://bschool.careers360.com/colleges/ranking/2018').text
soup = BeautifulSoup(src_code, features="html.parser")
i = 1
for trr in soup.find_all('tr', class_='odd'):
    i += 1
    college = trr.td.a.text
    print(college)
    if i % 2 == 0:
        class_ = 'even'
    else:
        class_ = 'odd'
Answer 0 (score: 4)
You can find the parent tag first, then collect every <tr> beneath it regardless of its class.
from bs4 import BeautifulSoup
import requests
src_code = requests.get('https://bschool.careers360.com/colleges/ranking/2018').content
soup = BeautifulSoup(src_code, features="html5lib")
trs = soup.find(name="div", id="related-results").find_all(name="tr")
trs
trs is the list you want:
[<tr><th>College Name</th><th>Rank</th><th>Overall Score</th><th>Rating</th><th>Ownership</th><th>Intake Exams</th><th></th></tr>,
<tr class="odd"><td><a href="https://www.careers360.com/university/indian-institute-of-management-ahmedabad">Indian Institute of Management Ahmedabad</a><br/></td><td><span class="serialNum circlerate Government"></span><span class="rankStyle">1</span></td><td><span class="overall_scoredata">427.92</span></td><td>AAAAA<div class="rankInfo"> <strong>2017 Rating: </strong> AAAAA</div></td><td><div class="ownership_name">Government</div><div class="rating_review rankInfo"><strong>User Rating: </strong>4.7 / 5</div></td><td><div class="showMoreCheck"> <input type="checkbox"/><div class="ranked_best_branch intakeExam"><div class="intakeExam ng-binding"><span class="best_branch plusMinus">CAT</span><ul><li>GMAT</li></ul></div></div></div></td><td><div class="rank-apply-button btnBlockInfo"><div class="flagging" id="divid-7057"><div class="flag-link flag-default-link"><a class="buttonDefault follow iframe-popup-button" href="/user/register?destination=colleges/ranking/2018&nid=7057&flag=bookmarks&click_location=follow_button&popup=iframe">Follow</a></div></div><div class="client_url"></div></div><div class="college-compare-checkbox combine-rating-block smallclListing"> <label> <input class="tmCheckbox" name="college_ranking" type="checkbox" value="7057"/><span></span> <i>Compare</i> </label></div></td></tr>,
<tr class="even"><td><a href="https://www.careers360.com/university/indian-institute-of-management-bangalore">Indian Institute of Management Bangalore</a><br/></td><td><span class="serialNum circlerate Government"></span><span class="rankStyle">2</span></td><td><span class="overall_scoredata">408.32</span></td><td>AAAAA<div class="rankInfo"> <strong>2017 Rating: </strong> AAAAA</div></td><td><div class="ownership_name">Government</div><div class="rating_review rankInfo"><strong>User Rating: </strong>4.1 / 5</div></td><td><div class="showMoreCheck"> <input type="checkbox"/><div class="ranked_best_branch intakeExam"><div class="intakeExam ng-binding"><span class="best_branch plusMinus">CAT</span><ul><li>GMAT</li></ul></div></div></div></td><td><div class="rank-apply-button btnBlockInfo"><div class="flagging" id="divid-6872"><div class="flag-link flag-default-link"><a class="buttonDefault follow iframe-popup-button" href="/user/register?destination=colleges/ranking/2018&nid=6872&flag=bookmarks&click_location=follow_button&popup=iframe">Follow</a></div></div><div class="client_url"></div></div><div class="college-compare-checkbox combine-rating-block smallclListing"> <label> <input class="tmCheckbox" name="college_ranking" type="checkbox" value="6872"/><span></span> <i>Compare</i> </label></div></td></tr>,
<tr class="odd"><td><a href="https://www.careers360.com/university/indian-institute-of-management-calcutta">Indian Institute of Management Calcutta</a><br/></td><td><span class="serialNum circlerate Government"></span><span class="rankStyle">3</span></td><td><span class="overall_scoredata">375.18</span></td><td>AAAAA<div class="rankInfo"> <strong>2017 Rating: </strong> AAAAA</div></td><td><div class="ownership_name">Government</div><div class="rating_review rankInfo"><strong>User Rating: </strong>4.9 / 5</div></td><td><div class="showMoreCheck"> <input type="checkbox"/><div class="ranked_best_branch intakeExam"><div class="intakeExam ng-binding"><span class="best_branch plusMinus">GMAT</span><ul><li>CAT</li></ul></div></div></div></td><td><div class="rank-apply-button btnBlockInfo"><div class="flagging" id="divid-6933"><div class="flag-link flag-default-link"><a class="buttonDefault follow iframe-popup-button" href="/user/register?destination=colleges/ranking/2018&nid=6933&flag=bookmarks&click_location=follow_button&popup=iframe">Follow</a></div></div><div class="client_url"></div></div><div class="college-compare-checkbox combine-rating-block smallclListing"> <label> <input class="tmCheckbox" name="college_ranking" type="checkbox" value="6933"/><span></span> <i>Compare</i> </label></div></td></tr>,
......
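The question also asks for only the 1st, 3rd, 4th, and 6th cells, which the listing above does not address. Here is a minimal sketch, assuming those cells are the direct <td> children of each row. It runs against a small made-up table shaped like the output above rather than the live page, and pick_cells and WANTED are hypothetical names introduced for illustration:

```python
from bs4 import BeautifulSoup

# Tiny stand-in for the real page (assumption: same row layout as the output above).
html = """
<div id="related-results"><table>
<tr><th>College Name</th><th>Rank</th><th>Overall Score</th><th>Rating</th><th>Ownership</th><th>Intake Exams</th></tr>
<tr class="odd"><td><a href="#">College A</a></td><td>1</td><td>427.92</td><td>AAAAA</td><td>Government</td><td>CAT</td></tr>
<tr class="even"><td><a href="#">College B</a></td><td>2</td><td>408.32</td><td>AAAAA</td><td>Government</td><td>GMAT</td></tr>
</table></div>
"""

WANTED = (0, 2, 3, 5)  # zero-based positions of the 1st, 3rd, 4th and 6th cells


def pick_cells(row, positions=WANTED):
    """Return the stripped text of the chosen <td> cells, or None if the row is too short."""
    cells = row.find_all("td", recursive=False)  # direct children only
    if len(cells) <= max(positions):
        return None  # e.g. the header row, which holds <th> instead of <td>
    return [cells[i].get_text(strip=True) for i in positions]


soup = BeautifulSoup(html, "html.parser")
rows = soup.find("div", id="related-results").find_all("tr")
picked = [cells for cells in map(pick_cells, rows) if cells is not None]
print(picked)
```

On the real page some cells contain nested <div> blocks (the rating cell, for example), so get_text would return their text as well and may need further trimming.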
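Answer 1 (score: 0)
find_all("tr",class_=['odd','even'])
Passing a list to class_ matches every tr tag that has either class. From each row you can then take its first td tag, the a tag inside it, and that tag's text.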
from bs4 import BeautifulSoup
import requests
src_code = requests.get('https://bschool.careers360.com/colleges/ranking/2018').text
soup = BeautifulSoup(src_code, features="html.parser")
alltr = soup.find_all("tr", class_=['odd', 'even'])
for x in alltr:
    print(x.td.a.text)
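To check the list form of class_ without hitting the network, here is a small sketch on a made-up table (the HTML and college names are placeholders, not data from the site). It suggests that rows come back in document order and that the header row, which has no class, is skipped:

```python
from bs4 import BeautifulSoup

# Made-up sample with alternating row classes, as described in the question.
html = """
<table>
<tr><th>College Name</th></tr>
<tr class="odd"><td><a href="#">College A</a></td></tr>
<tr class="even"><td><a href="#">College B</a></td></tr>
<tr class="odd"><td><a href="#">College C</a></td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# A list passed to class_ matches a tag carrying any of the listed classes.
by_list = soup.find_all("tr", class_=["odd", "even"])
names = [row.td.a.get_text(strip=True) for row in by_list]
print(names)
```

Because one call returns both kinds of rows, there is no need to toggle class_ inside the loop as the question's code tries to do.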