BeautifulSoup: getting the next siblings after using findAll(text='')

Asked: 2013-06-07 10:46:47

Tags: python html html-parsing beautifulsoup

After finding what I want by searching the HTML with soup.findAll, how can I get the next siblings using bs4?

<td class="name">David<span class="flag away"</span>
</td>
    <td class="team">b<span class="team b"></span></td>
    <td class="time">99'</td>

<td class="name">James<span class="flag home"</span>
</td>
    <td class="team">a<span class="team a"></span></td>
    <td class="time">99'</td>

Using findAll I can find the text:

for t in soup.findAll(text='David'):
    print(t)
>> David

However, the result I want is:

<td class="team">b<span class="team b"></span></td>
<td class="time">99'</td>

1 Answer:

Answer 0: (score: 6)

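from bs4 import BeautifulSoup, Tag


html = """<td class="name">David<span class="flag away"</span>
</td>
    <td class="team">b<span class="team b"></span></td>
    <td class="time">99'</td>

<td class="name">James<span class="flag home"</span>"""

# Name the parser explicitly so the result does not depend on what is installed.
web_soup = BeautifulSoup(html, "html.parser")
for t in web_soup.findAll(text='David'):
    # t is the text node; its parent is the <td class="name"> cell.
    for item in t.parent.next_siblings:
        # Siblings include whitespace strings, so keep only tags.
        if isinstance(item, Tag):
            # Stop at the next "name" cell, which starts the next person.
            if 'class' in item.attrs and 'name' in item.attrs['class']:
                break
            print(item)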

This prints:

<td class="team">b<span class="team b"></span></td>
<td class="time">99'</td>

Hope this is what you wanted.
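
The same idea can be wrapped in a small reusable function. The following is only a sketch under a few assumptions: the markup is a well-formed variant of the snippet from the question (the spans are closed and the cells sit inside a table row), the parser is the built-in "html.parser", and `cells_after_name` is a hypothetical helper name, not part of bs4. It uses `find(string=...)`, the newer spelling of the `text=` argument.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Well-formed variant of the question's markup (assumption: spans are closed).
HTML = """<table><tr>
<td class="name">David<span class="flag away"></span></td>
<td class="team">b<span class="team b"></span></td>
<td class="time">99'</td>
<td class="name">James<span class="flag home"></span></td>
<td class="team">a<span class="team a"></span></td>
<td class="time">99'</td>
</tr></table>"""


def cells_after_name(soup, name):
    """Return the tags following the cell whose text is `name`,
    stopping at the next cell with class "name".

    Hypothetical helper for illustration; not part of bs4.
    """
    text_node = soup.find(string=name)
    if text_node is None:
        return []
    cells = []
    for sibling in text_node.parent.next_siblings:
        if not isinstance(sibling, Tag):
            continue  # skip the whitespace strings between tags
        if "name" in sibling.get("class", []):
            break  # the next person's row of cells starts here
        cells.append(sibling)
    return cells


soup = BeautifulSoup(HTML, "html.parser")
for cell in cells_after_name(soup, "David"):
    print(cell)
```

Returning a list instead of printing inside the loop makes it easy to reuse the result, for example `[c.get_text() for c in cells_after_name(soup, "James")]` to get just the team and time values.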