<div class="d-flex flex-column flex-sm-row justify-content-sm-start align-items-sm-center justify-content-start align-items-center card box-shadow RankItem">
<div class="d-flex flex-column justify-content-center align-items-center LeftSection">
<div class="rank RankNumber"><span>#</span>10</div>
<div class="score">SCORE 7.597</div>
<span class="ChgUp" style="display:none !important;"><i aria-hidden="" class="fas fa-arrow-circle-up" title="up"></i></span>
<span class="ChgDown" style="display:none !important;"><i aria-hidden="" class="fas fa-arrow-circle-down" title="down"></i></span>
<span class="d-flex flex-row align-items-center ChgNeutral" style="display:none !important;">
<i aria-hidden="" class="fa-stack fa-2x" title="no change">
<i class="fas fa-circle fa-stack-2x"></i>
<i class="fal fa-arrows-h fa-stack-1x fa-inverse"></i>
</i>
</span>
<span class="d-flex flex-row align-items-center">
<i aria-hidden="" class="fa-stack fa-2x" title="no change">
<i class="fas fa-circle fa-stack-2x"></i>
<i class="fal fa-arrows-h fa-stack-1x fa-inverse"></i>
</i>
2019 Rank 10 </span>
</div>
I want to use Beautiful Soup to extract "2019" from this page source. I only want the number 2019. Can anyone help?
Answer 0 (score: 1)
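Below is my answer, written after checking your previous question and working out for myself what you are trying to achieve with this link: https://www.vault.com/best-companies-to-work-for/law/top-100-law-firms-rankings/year/2020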
from bs4 import BeautifulSoup
html = """
<span class="d-flex flex-row align-items-center">
<i class="fa-stack fa-2x" aria-hidden="" title="no change">
<i class="fas fa-circle fa-stack-2x"></i>
<i class="fal fa-arrows-h fa-stack-1x fa-inverse"></i>
</i>
2019 Rank 10 </span>
"""
soup = BeautifulSoup(html, 'html.parser')
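# Match only spans whose class attribute is exactly this string
for item in soup.find_all('span', attrs={'class': 'd-flex flex-row align-items-center'}):
    text = item.text.strip()  # "2019 Rank 10"
    print(text[0:4])  # the first four characters are the year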
Output:
2019
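Slicing the first four characters works only while the text starts with a four-digit year. As a rough alternative sketch, a regular expression can pull out both the year and the rank without depending on character positions. The pattern and the helper name `extract_year_and_rank` are assumptions made for illustration, based only on the "2019 Rank 10" text in the snippet; the live page may format this differently.

```python
import re
from bs4 import BeautifulSoup

# Trimmed copy of the span from the question
html = """
<span class="d-flex flex-row align-items-center">
<i class="fa-stack fa-2x" aria-hidden="" title="no change"></i>
2019 Rank 10 </span>
"""

# Assumed text shape: a four-digit year, the word "Rank", then a number
RANK_PATTERN = re.compile(r"(\d{4})\s+Rank\s+(\d+)")


def extract_year_and_rank(markup):
    """Return a list of (year, rank) integer pairs found in span text."""
    soup = BeautifulSoup(markup, "html.parser")
    results = []
    for span in soup.find_all("span"):
        # Collapse the span's text nodes into one whitespace-trimmed string
        text = span.get_text(" ", strip=True)
        match = RANK_PATTERN.search(text)
        if match:
            results.append((int(match.group(1)), int(match.group(2))))
    return results


print(extract_year_and_rank(html))
```

For the snippet above this prints `[(2019, 10)]`. Spans without a "year Rank number" text, such as the hidden arrow icons, are simply skipped.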