<li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_type_width">
Height:
</i>
6' 2"
</li>
For the markup above, I only want to retrieve 6' 2" and ignore "Height:". My code:
stat_one = stat_table_one.find_all("li", {"class": "b-list__box-list-item b-list__box-list-item_type_block"})
for li in stat_one:
    print(li.get_text())
This code pulls out both "Height:" and 6' 2". Is there a way to get only 6' 2"?
Answer 0 (score: 1)
In [1]: from bs4 import BeautifulSoup
In [2]: h = """<li class="b-list__box-list-item b-list__box-list-item_type_block">
...: <i class="b-list__box-item-title b-list__box-item-title_type_width">
...: Height:
...: </i>
...: 6' 2"
...: </li>
...: """
In [3]: soup = BeautifulSoup(h,"html.parser")
In [4]: "".join(soup.find("li").find_all(text=True, recursive=False)).strip()
Out[4]: u'6\' 2"'
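You don't want the text of the child <i> tag, so don't recurse: with recursive=False, find_all only returns the text nodes that sit directly inside the <li>.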
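
The same idea can be applied to the loop from the question. This is a minimal sketch, not a definitive implementation: the helper name own_text is made up for illustration, the second <li> (Weight) is invented sample data so that the loop has more than one item, and string=True is the newer spelling of text=True in recent BeautifulSoup releases.

```python
from bs4 import BeautifulSoup

# Sample markup: the first <li> is from the question; the second one
# (Weight) is a hypothetical item added only to exercise the loop.
html = """<ul>
<li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_type_width">
Height:
</i>
6' 2"
</li>
<li class="b-list__box-list-item b-list__box-list-item_type_block">
<i class="b-list__box-item-title b-list__box-item-title_type_width">
Weight:
</i>
155 lbs.
</li>
</ul>"""

soup = BeautifulSoup(html, "html.parser")


def own_text(tag):
    """Return only the text nodes that are direct children of tag."""
    # recursive=False keeps the search from descending into <i>,
    # so the "Height:" / "Weight:" labels are skipped.
    return "".join(tag.find_all(string=True, recursive=False)).strip()


# class_ matches any element that carries this class among its classes.
items = soup.find_all("li", class_="b-list__box-list-item")
values = [own_text(li) for li in items]

for value in values:
    print(value)
```

Joining the direct text nodes before stripping matters because the <li> has two of them: the newline before the <i> tag and the value after it.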