How do I extract the text after a <br/> tag? I only want that text, not any of the text inside the <strong> tag.
<p><strong>A title</strong><br/>
Text I want which also
includes linebreaks.</p>
I tried code like this:
text_content = paragraph.get_text(separator='strong/').strip()
But this also includes the text inside the <strong> tag.
In case it isn't clear, the "paragraph" variable is a bs4.element.Tag.
Any help is appreciated!
Answer 0 (score: 1)
If you have the <p> tag, find the <br> inside it and use .next_siblings:
import bs4
html = '''<p><strong>A title</strong><br/>
Text I want which also
includes linebreaks.</p>'''
soup = bs4.BeautifulSoup(html, 'html.parser')
paragraph = soup.find('p')
# next_siblings yields every node after <br>; here they are all strings, so join() works
text_wanted = ''.join(paragraph.find('br').next_siblings)
print(text_wanted)
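The join above works because every node after the <br> is a plain string. If the text after the <br> could also contain nested tags (for example an <em>), joining the siblings directly would fail on the Tag objects. The following is a minimal sketch of one way to handle that case; the helper name text_after_br and the sample HTML with <em> are assumptions made for illustration, not part of the original answer.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag


def text_after_br(paragraph):
    """Return the text of every node that follows the first <br> in `paragraph`."""
    br = paragraph.find('br')
    if br is None:
        # No <br> in this paragraph, so there is nothing "after" it.
        return ''
    parts = []
    for node in br.next_siblings:
        # Tags (e.g. <em>) contribute their inner text; strings are used as-is.
        parts.append(node.get_text() if isinstance(node, Tag) else str(node))
    return ''.join(parts).strip()


# Hypothetical sample: same shape as the question's HTML, plus a nested <em>.
html = ('<p><strong>A title</strong><br/>\n'
        'Text I want <em>which</em> also\n'
        'includes linebreaks.</p>')
paragraph = BeautifulSoup(html, 'html.parser').find('p')
result = text_after_br(paragraph)
print(result)
```

The text inside <strong> is skipped because it comes before the <br>, and internal line breaks are preserved; only leading and trailing whitespace is stripped.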
Answer 1 (score: 1)
Find the <br> tag and use next_element:
from bs4 import BeautifulSoup
data='''<p><strong>A title</strong><br/>
Text I want which also
includes linebreaks.</p>'''
soup = BeautifulSoup(data, 'html.parser')
# next_element is the single node parsed immediately after <br>
item = soup.find('p').find('br').next_element
print(item)
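Note that next_element returns only the single node parsed right after the <br>, which is enough for the question's HTML because everything after the <br> is one string. The sketch below compares it with next_siblings on a made-up paragraph that contains a nested <em>; the sample HTML and the variable names are assumptions for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample where the text after <br> is split by a nested tag.
data = '<p><strong>A title</strong><br/>first part <em>emphasis</em> last part</p>'
br = BeautifulSoup(data, 'html.parser').find('p').find('br')

only_next = br.next_element         # just the node parsed right after <br>
all_after = list(br.next_siblings)  # every node after <br> inside the <p>

print(repr(only_next))
print(len(all_after))
```

Here next_element stops at the first string, while next_siblings also covers the <em> and the trailing text, so the latter is the safer choice when the content after the <br> may contain markup.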