from bs4 import BeautifulSoup
page = """<span id="something">useless</span>
<span id="">some text</span>
<span id="different">useless</span>"""
soup = BeautifulSoup(page, "html.parser")  # name the parser explicitly to avoid the "no parser specified" warning
How can I get only "some text"? Using soup.find_all('span', {'id': ""}) finds everything.
Answer 0 (score: 1)
You have two options:
Use a custom filter: pass in a function, which is called for each element and must return True or False:
soup.find_all(lambda e: e.name == 'span' and e.attrs.get('id') == '')
Use a CSS selector with an exact attribute match:
soup.select('span[id=""]')
Demo:
>>> from bs4 import BeautifulSoup
>>> page = """<span id="something">useless</span>
... <span id="">some text</span>
... <span id="different">useless</span>"""
>>> soup = BeautifulSoup(page, "html.parser")
>>> soup.find_all(lambda e: e.name == 'span' and e.attrs.get('id') == '')
[<span id="">some text</span>]
>>> soup.select('span[id=""]')
[<span id="">some text</span>]