我有跟随html,我试图弄清楚我是如何告诉BeautifulSoup在某些html元素之后提取td。在这种情况下,我想在<td>
<td>Color Digest</td>
中获取数据
<tr>
<td> Color Digest </td>
<td> 2,36,156,38,25,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, </td>
</tr>
这是整个HTML
<html>
<head>
<body>
<div align="center">
<table cellspacing="0" cellpadding="0" style="clear:both; width:100%;margin:0px; font-size:1pt;">
<br>
<br>
<table>
<table>
<tbody>
<tr bgcolor="#AAAAAA">
<tr>
<tr>
<tr>
<tr>
<tr>
<tr>
<tr>
<tr>
<tr>
<tr>
<tr>
<tr>
<td> Color Digest </td>
<td> 2,36,156,38,25,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, </td>
</tr>
</tbody>
</table>
答案 0 :(得分:4)
听起来您需要遍历<td>
列表并在找到数据后停止。
示例:
from BeautifulSoup import BeautifulSoup
soup = BeautifulSoup('<html><tr><td>X</td><td>Color Digest</td><td>THE DIGEST</td></tr></html>')
for cell in soup.html.tr.findAll('td'):
if 'Color Digest' == cell.text:
print cell.nextSibling.text