I want to find all div tags with class="post-<some number> some text". There are multiple div tags, for example:
<div class="post-3562 some text">
<div class="post-some text">
<div class="post-some text">
<div class="post-1324 some text">
<div class="post-4540 some text">
<div class="post-some text">
<div class="post-1122 some text">
I only want to get the div tags whose class contains "post-<some number>". Currently I have written:
allPostsDiv = soup.find_all("div", class_= "post")
Is there a way to achieve what I want? Would a regular expression help? Any help would be appreciated.
Answer 0 (score: 3)
You can pass a regular expression as the value of the class_ parameter, like this:
soup.find_all(name='div', class_=re.compile(r'^post-\d+$'))
Complete program:
from bs4 import BeautifulSoup
import re
soup = BeautifulSoup('''
<root>
<div class="post-3562 some text"/>
<xdiv class="post-9999 some text"/>
<div class="post-some text"/>
<div class="post-some text"/>
<div class="post-1324some text"/>
<div class="some post-4540 text"/>
<div class="post-some text"/>
<div class="some text post-1122"/>
</root>''', 'html.parser')
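# The regex is tested against each class token separately,
# so "post-4540" matches wherever it appears in the attribute.
for div in soup.find_all(name='div', class_=re.compile(r'^post-\d+$')):
    print(div)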
Result:
<div class="post-3562 some text"></div>
<div class="some post-4540 text"></div>
<div class="some text post-1122"></div>
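If you also need the number itself, one possible follow-up is to add a capture group to the same pattern and read it back from each matched div's class tokens. This is only a sketch: the names POST_CLASS and post_ids are made up for illustration, and it assumes the html.parser builder, where div['class'] is returned as a list of tokens.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical helper names; the capture group isolates the numeric part.
POST_CLASS = re.compile(r'^post-(\d+)$')

def post_ids(html):
    """Return the number from every post-<number> class found on a div."""
    soup = BeautifulSoup(html, 'html.parser')
    ids = []
    for div in soup.find_all('div', class_=POST_CLASS):
        # div['class'] is a list of tokens, e.g. ['post-3562', 'some', 'text']
        for token in div['class']:
            match = POST_CLASS.match(token)
            if match:
                ids.append(int(match.group(1)))
    return ids

html = '''
<div class="post-3562 some text"></div>
<div class="post-some text"></div>
<div class="some post-4540 text"></div>
'''
print(post_ids(html))
```

The div with class "post-some text" is skipped because none of its tokens consists of "post-" followed only by digits.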
Answer 1 (score: -1)