The input data looks like the following. It contains multiple `ul` tags that I want to scrape with Python's Beautiful Soup.
<div class="column one-second"><p></p> <ul> <li>Commercial automobile</li> <li>Excess liability</li> <li>General liability</li> <li>Inland marine (cargo)</li> </ul> <p></p></div> <div class="column one-second"><p></p> <ul> <li>Professional Liability</li> <li>Property</li> <li>Workers’ compensation</li> </ul> <p></p></div>
To get the list items from the `ul` tags using the Beautiful Soup library, I tried the following, but none of it worked:
amusements_soup.find_all('li', attrs={'id': 'menu-item-16'})
amusements_soup.find_all('div',{'class':'column one-second'})
ul = amusements_soup.find("h2", text="Services & Solutions").find_next_sibling("ul")
Expected output:
> Commercial automobile
>
> Excess liability
>
> General liability
>
> Inland marine
>
> Professional Liability
>
> Workers’ compensation
Answer 0 (score: 0)
Assuming `amusements_soup` contains the HTML you mentioned, this should work:
from bs4 import BeautifulSoup
page = '<div class="column one-second"><p></p> <ul> <li>Commercial automobile</li> <li>Excess liability</li> <li>General liability</li> <li>Inland marine (cargo)</li> </ul> <p></p></div> <div class="column one-second"><p></p> <ul> <li>Professional Liability</li> <li>Property</li> <li>Workers’ compensation</li> </ul> <p></p></div>'
amusements_soup = BeautifulSoup(page,"html.parser")
# Visit each column div, then print the text of every li inside it
for item in amusements_soup.find_all('div', {'class': 'column one-second'}):
    sub_items = item.find_all('li')
    for sub_item in sub_items:
        print(sub_item.text)
Output:
Commercial automobile
Excess liability
General liability
Inland marine (cargo)
Professional Liability
Property
Workers’ compensation
If this does not work for you, check that `amusements_soup` really contains what you think it does.
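One way to run that check is to count how many elements the selectors actually match before looping over them. The sketch below is only an illustration: `count_matches` is a hypothetical helper, not part of Beautiful Soup, and the sample string is a shortened stand-in for the real page.

```python
from bs4 import BeautifulSoup


def count_matches(html):
    """Return (number of column divs, number of li tags) found in html.

    Hypothetical helper for debugging; not part of Beautiful Soup.
    """
    soup = BeautifulSoup(html, "html.parser")
    divs = soup.find_all("div", {"class": "column one-second"})
    items = soup.find_all("li")
    return len(divs), len(items)


# Shortened stand-in for the markup in the question
sample = (
    '<div class="column one-second"><ul><li>Property</li></ul></div>'
    '<div class="column one-second"><ul><li>Excess liability</li></ul></div>'
)

print(count_matches(sample))
print(count_matches("<html><body></body></html>"))
```

If both counts come back as zero for the real page, the HTML that was parsed probably does not contain the markup shown in the question. One possible cause is that the content is added by JavaScript after the page loads, but that is an assumption to verify by printing the fetched HTML.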
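Answer 1 (score: 0)
The same thing, using a class selector and a type selector joined by a descendant combinator, inside a list comprehension: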
results = [item.text for item in amusements_soup.select('.one-second li')]
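For completeness, here is a self-contained sketch of the selector approach that can be run as-is. It assumes the HTML from the question is available as a string (named `page` here). Using `get_text(strip=True)` instead of `.text` is an optional choice that trims surrounding whitespace.

```python
from bs4 import BeautifulSoup

# HTML copied from the question
page = (
    '<div class="column one-second"><p></p> <ul> <li>Commercial automobile</li> '
    '<li>Excess liability</li> <li>General liability</li> '
    '<li>Inland marine (cargo)</li> </ul> <p></p></div> '
    '<div class="column one-second"><p></p> <ul> <li>Professional Liability</li> '
    '<li>Property</li> <li>Workers’ compensation</li> </ul> <p></p></div>'
)

amusements_soup = BeautifulSoup(page, "html.parser")

# '.one-second li' matches every li that is a descendant of an element
# whose class list includes "one-second"
results = [item.get_text(strip=True) for item in amusements_soup.select(".one-second li")]

for text in results:
    print(text)
```

Because the selector only requires the `one-second` class, it would also match columns that carry additional classes, which an exact class-string match would not.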