I am searching for a specific string that should exactly match a tag's text value. How can I search using only the term 'RESULTS' and get the 'h4' tag back?
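from bs4 import BeautifulSoup
# No parser is named here; the 'html' result below implies one that wraps the fragment in <html>, such as lxml.
soup = BeautifulSoup('<table><tbody><tr><td class="fulltext-body-paragraph"><a name="44"></a><div class="fulltext-LEVEL1"><h4>RESULTS</h4></div></td></tr></tbody></table>')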
soup.find(lambda el: el.text == 'RESULTS').name
Out: 'html' # I would like it to return 'h4'
Answer 0 (score: 0)
Does this (https://stackoverflow.com/a/13349041/7573286) solve your problem?
from bs4 import BeautifulSoup
import re

html_text = """
<h2>this is cool #12345678901</h2>
<h2>this is nothing</h2>
<h2>this is interesting #126666678901</h2>
<h2>this is blah #124445678901</h2>
"""
soup = BeautifulSoup(html_text, 'html.parser')
# Even though the OP was not looking for 'cool', it's more understandable to work with item zero.
pattern = re.compile(r'cool')

# Searching by string alone returns the matching text node (a NavigableString), not the tag.
# string= is the current bs4 name for the older text= argument.
text_node = soup.find(string=pattern)
print(text_node)
#>> this is cool #12345678901
# The text node's parent is the tag that encloses it.
print(text_node.parent)
#>> <h2>this is cool #12345678901</h2>
print(soup.find('h2'))
#>> <h2>this is cool #12345678901</h2>
# In bs4, combining a tag name with string= returns the tag itself, not its text.
print(soup.find('h2', string=pattern))
#>> <h2>this is cool #12345678901</h2>
print(soup.find('h2', string=pattern) == soup.find('h2'))
#>> True
print(text_node == soup.find('h2').text)
#>> True
print(text_node.parent == soup.find('h2'))
#>> True
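
Applied to the markup from the question, the same idea might look like the sketch below. It assumes bs4 with the built-in 'html.parser' (the question does not say which parser was used), and the variable names are only illustrative. The lambda in the question matches an outer element because .text joins the text of all descendants, so every ancestor of the h4 also has the text 'RESULTS'. Matching the text node itself and stepping up to its parent avoids that.

```python
from bs4 import BeautifulSoup

html = ('<table><tbody><tr><td class="fulltext-body-paragraph"><a name="44"></a>'
        '<div class="fulltext-LEVEL1"><h4>RESULTS</h4></div></td></tr></tbody></table>')
# Parser choice is an assumption; html.parser does not add <html>/<body> wrappers.
soup = BeautifulSoup(html, 'html.parser')

# Every ancestor of the <h4> has .text == 'RESULTS', so the outermost one wins.
outermost = soup.find(lambda el: el.text == 'RESULTS')
print(outermost.name)

# A plain string matches a text node only if it is exactly equal to 'RESULTS'.
text_node = soup.find(string='RESULTS')
# The parent of the text node is the tag that directly contains it.
tag = text_node.parent
print(tag.name)
#>> h4
```

If the text may carry surrounding whitespace, a compiled pattern such as re.compile(r'^\s*RESULTS\s*$') could be passed as string= instead of the literal.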