How do I find specific tr tags in a table body?
For example, consider the following:
<table cellspacing="1" cellpadding="3" class="tablehead">
<tbody>
<tr class="stathead">...</tr>
<tr class="colhead">...</tr>
<tr class="oddrow team-23-2046">...</tr>
<tr class="evenrow team-22-1234">...</tr>
<tr class="oddrow team-25-2326">...</tr>
<tr class="evenrow team-25-2262">...</tr>
</tbody>
</table>
I need all of the "oddrow" and "evenrow" rows, but not "stathead" or "colhead". I can do that with slicing:
for data in soup.find_all("table", {"class": "tablehead"}):
    for row in data.find_all('tr')[2:]:
        print(row.text)
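But I can't be sure that every page I'm scraping will use this layout, so I'd rather search explicitly for "oddrow"/"evenrow". The team numbers also differ from page to page, so I can't use them for an exact match.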
Answer 0 (score: 2)
You can try this:
soup.find("table", {"class": "tablehead"}).find_all("tr", {"class": ["oddrow", "evenrow"]})
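The reason this works without knowing the team numbers is that BeautifulSoup treats class as a multi-valued attribute: a filter is compared against each individual class token, and a list filter matches if any one of its entries matches. A minimal sketch of that behaviour, assuming a made-up two-row fragment (the team-1-1 class and cell contents are hypothetical), and using the class_ keyword, which is equivalent to the {"class": ...} dict form:

```python
from bs4 import BeautifulSoup

# Hypothetical fragment, invented only to illustrate the matching rule.
html = (
    '<table class="tablehead">'
    '<tr class="colhead"><td>header</td></tr>'
    '<tr class="oddrow team-1-1"><td>data</td></tr>'
    '</table>'
)
soup = BeautifulSoup(html, "html.parser")

# A list filter matches a tag if ANY of its class tokens equals ANY list entry,
# so the extra "team-1-1" token does not prevent a match.
rows = soup.find_all("tr", class_=["oddrow", "evenrow"])

# The class attribute is exposed as a list of tokens, not a single string.
tokens = rows[0]["class"]
print(len(rows), tokens)
```

Because the comparison is per token, the varying team-... class never needs to appear in the filter.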
Example:
from bs4 import BeautifulSoup

soup = BeautifulSoup("""<table cellspacing="1" cellpadding="3" class="tablehead">
<tbody>
<tr class="stathead">...</tr>
<tr class="colhead">...</tr>
<tr class="oddrow team-23-2046">...</tr>
<tr class="evenrow team-22-1234">...</tr>
<tr class="oddrow team-25-2326">...</tr>
<tr class="evenrow team-25-2262">...</tr>
</tbody>
</table>""", "html.parser")
soup.find("table", {"class": "tablehead"}).find_all("tr", {"class": ["oddrow", "evenrow"]})
#[<tr class="oddrow team-23-2046">...</tr>,
# <tr class="evenrow team-22-1234">...</tr>,
# <tr class="oddrow team-25-2326">...</tr>,
# <tr class="evenrow team-25-2262">...</tr>]
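As an alternative, the same rows can probably be selected with a CSS selector through select(). This is only a sketch, not part of the answer above: the cell contents in the sample markup are made up, and the selector assumes the rows always sit inside a table with class tablehead. One practical difference is that select() returns an empty list when the table is missing, whereas soup.find(...) returns None and the chained .find_all(...) would then raise AttributeError.

```python
from bs4 import BeautifulSoup

# Sample markup modelled on the question; the <td> contents are hypothetical.
html = """
<table cellspacing="1" cellpadding="3" class="tablehead">
<tbody>
<tr class="stathead"><td>stats</td></tr>
<tr class="colhead"><td>columns</td></tr>
<tr class="oddrow team-23-2046"><td>a</td></tr>
<tr class="evenrow team-22-1234"><td>b</td></tr>
<tr class="oddrow team-25-2326"><td>c</td></tr>
<tr class="evenrow team-25-2262"><td>d</td></tr>
</tbody>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

# "tr.oddrow" matches any <tr> whose class list contains "oddrow",
# regardless of the other classes; the comma means "either selector".
rows = soup.select("table.tablehead tr.oddrow, table.tablehead tr.evenrow")
first_classes = [row["class"][0] for row in rows]

# A page without the table simply yields no rows instead of raising.
empty = BeautifulSoup("<p>no table here</p>", "html.parser")
missing = empty.select("table.tablehead tr.oddrow, table.tablehead tr.evenrow")

print(first_classes, missing)
```

If pages may lack the table entirely, this form avoids an explicit None check before iterating over the rows.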