Retrieving a specific line of text

Time: 2016-09-03 20:09:06

Tags: python beautifulsoup

<li class="b-list__box-list-item b-list__box-list-item_type_block">
      <i class="b-list__box-item-title b-list__box-item-title_type_width">
        Height:
      </i>
      6' 2"
</li>

From the markup above, I only want to retrieve 6' 2" and ignore "Height:". My code:

stat_one = stat_table_one.find_all("li", {"class": "b-list__box-list-item b-list__box-list-item_type_block"})

for li in stat_one:
    # get_text() concatenates the text of the <li> and all of its descendants
    print(li.get_text())

This code returns both "Height:" and 6' 2". Is there a way to get only 6' 2"?

1 answer:

Answer 0: (score: 1)

In [1]: from bs4 import BeautifulSoup

In [2]: h = """<li class="b-list__box-list-item b-list__box-list-item_type_block">
   ...:       <i class="b-list__box-item-title b-list__box-item-title_type_width">
   ...:         Height:
   ...:       </i>
   ...:       6' 2"
   ...: </li>
   ...: """    
In [3]: soup = BeautifulSoup(h,"html.parser")

In [4]: "".join(soup.find("li").find_all(text=True, recursive=False)).strip()

Out[4]: u'6\' 2"'

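You don't want the text of the child elements, so don't recurse: recursive=False limits find_all to the direct children of the <li>, so the text inside the <i> is skipped.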
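
The same idea can be wrapped in a small helper and applied to every matching <li>. This is a minimal sketch, not part of the original answer: the helper name own_text and the HTML constant are made up for illustration, and it filters the tag's direct children for NavigableString instances rather than calling find_all, which should give the same result on this markup.

```python
from bs4 import BeautifulSoup, NavigableString

# Sample markup copied from the question.
HTML = """<li class="b-list__box-list-item b-list__box-list-item_type_block">
      <i class="b-list__box-item-title b-list__box-item-title_type_width">
        Height:
      </i>
      6' 2"
</li>"""


def own_text(tag):
    """Return the text directly inside `tag`, skipping text of child tags.

    `own_text` is a hypothetical helper name, not a BeautifulSoup API.
    """
    # .children yields only direct children; strings are NavigableString objects.
    parts = [child for child in tag.children if isinstance(child, NavigableString)]
    return "".join(parts).strip()


soup = BeautifulSoup(HTML, "html.parser")

# class_ matches any element that carries this class among its classes.
for li in soup.find_all("li", class_="b-list__box-list-item"):
    label = li.find("i").get_text(strip=True)  # text of the <i> child only
    value = own_text(li)                       # text owned by the <li> itself
    print(label, value)  # prints: Height: 6' 2"
```

Keeping the label and the value separate like this makes it straightforward to build a dict of stats when the table contains several such rows.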