I am trying to remove the <li> tags.
My HTML:
<ul id="MenuGreyBar">
<li style="left: 0px;">
<a href="#" class="bgGrey"> </a>
</li>
</ul>
<ul>
<li>
<a href="about_us.html" class="bgLightBlue">About Us</a>
</li>
<li >
<a href="Help_Support.html" class="bgMuddyGreen">Help & Support</a>
</li>
<li >
<a href="Law_Info.html" class="bgGreen">Law & Information</a>
</li>
<!-- ... There are a few more. -->
</ul>
I need to remove the <li> tags.
The code I got:
Answer 0 (score: 3)
You are going about this the wrong way; just search for the li tags and call .decompose() on each of them:
soup = BeautifulSoup(input_document)
# decompose() removes each <li> together with everything inside it
for li in soup.find_all('li'):
    li.decompose()
Demo:
>>> from bs4 import BeautifulSoup
>>> input_document = '''\
... <ul id="MenuGreyBar">
... <li style="left: 0px;">
... <a href="#" class="bgGrey"> </a>
... </li>
... </ul>
...
... <ul>
... <li>
... <a href="about_us.html" class="bgLightBlue">About Us</a>
... </li>
... <li >
... <a href="Help_Support.html" class="bgMuddyGreen">Help & Support</a>
... </li>
... <li >
... <a href="Law_Info.html" class="bgGreen">Law & Information</a>
... </li>
... <!-- ... There are a few more. -->
... </ul>
... '''
>>> soup = BeautifulSoup(input_document)
>>> for li in soup.find_all('li'):
...     li.decompose()
...
>>> print(soup)
<html><head></head><body><ul id="MenuGreyBar">
</ul>
<ul>
<!-- ... There are a few more. -->
</ul>
</body></html>
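
For reference, here is a minimal Python 3 sketch of the same approach wrapped in a function. The function name `remove_li_tags` and the `keep_children` flag are hypothetical, made up for illustration. It assumes the stdlib `'html.parser'` backend, which (unlike the parser used in the demo above) does not wrap the result in `<html>`/`<body>`. If "removing the tag" is meant to keep the links inside each `<li>`, `.unwrap()` may be closer to what is wanted than `.decompose()`; that reading of the question is an assumption.

```python
from bs4 import BeautifulSoup


def remove_li_tags(html, keep_children=False):
    """Return html with every <li> removed (hypothetical helper).

    keep_children=False: decompose() drops each <li> and everything inside it.
    keep_children=True:  unwrap() drops only the <li> tag and keeps its children.
    """
    # Assumption: the stdlib 'html.parser' backend, so no <html>/<body> is added.
    soup = BeautifulSoup(html, 'html.parser')
    # find_all() returns a list, so modifying the tree inside the loop is safe.
    for li in soup.find_all('li'):
        if keep_children:
            li.unwrap()
        else:
            li.decompose()
    return str(soup)


sample = '<ul><li><a href="about_us.html">About Us</a></li></ul>'
print(remove_li_tags(sample))
print(remove_li_tags(sample, keep_children=True))
```

The first call should leave an empty `<ul>`, while the second should leave the `<a>` element directly inside the `<ul>`.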