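Hello, I am trying to scrape some data from a website. On each run I need to find, in the page, the last element I processed the previous time, and then select the element that precedes it. Please check my code; I explain this in more detail in the example below.
Here is the sample HTML: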
<div class="post" id="7517049">
<div class="p-head">
<div class="p-c p-c-time"><span class="p-time" data="1554741054" title="2019-04-08 @ 21:00:54 ( Your Time )"><span class="t-n-m">45</span> <span class="t-u">mins</span></span>
</div>
<div class="p-c p-c-cat"><span class="p-cat c-5 c-7 "><a href="http://predb.me?cats=tv" class="c-adult">TV</a><a href="http://predb.me?cats=tv-hd" class="c-child">HD</a></span></div>
<div class="p-c p-c-title">
<h2><a class="p-title" href="http://predb.me?post=7517049">The.Repair.Shop.S04E02.720p.WEBRip.x264-LiGATE</a></h2>
<a rel="nofollow" href="http://predb.me?post=7517049" class="tb tb-perma" title="Visit the permanent page for this release."></a>
</div>
</div>
</div>
<div class="post" id="7517048">
<div class="p-head">
<div class="p-c p-c-time"><span class="p-time" data="1554740951" title="2019-04-08 @ 20:59:11 ( Your Time )"><span class="t-n-m">47</span> <span class="t-u">mins</span></span>
</div>
<div class="p-c p-c-cat"><span class="p-cat c-24 c-25 "><a href="http://predb.me?cats=books" class="c-adult">Books</a><a href="http://predb.me?cats=books-ebooks" class="c-child">eBooks</a></span></div>
<div class="p-c p-c-title">
<h2><a class="p-title" href="http://predb.me?post=7517048">John.Bell.Young.Puccini.A.Listeners.Guide.Dover.Books.on.Music.and.Music.History.2016.RETAiL.ePub.eBook-VENTOLiN</a></h2>
<a rel="nofollow" href="http://predb.me?post=7517048" class="tb tb-perma" title="Visit the permanent page for this release."></a>
</div>
</div>
</div>
<div class="post" id="7517047">
<div class="p-head">
<div class="p-c p-c-time"><span class="p-time" data="1554740927" title="2019-04-08 @ 20:58:47 ( Your Time )"><span class="t-n-m">48</span> <span class="t-u">mins</span></span>
</div>
<div class="p-c p-c-cat"><span class="p-cat c-5 c-6 "><a href="http://predb.me?cats=tv" class="c-adult">TV</a><a href="http://predb.me?cats=tv-sd" class="c-child">SD</a></span></div>
<div class="p-c p-c-title">
<h2><a class="p-title" href="http://predb.me?post=7517047">The.Repair.Shop.S04E01.WEB.h264-LiGATE</a></h2>
<a rel="nofollow" href="http://predb.me?post=7517047" class="tb tb-perma" title="Visit the permanent page for this release."></a>
</div>
</div>
</div>
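Above there are 3 main divs, each containing further divs. For example, I take the value of the <a> tag in the 3rd main div, which is The.Repair.Shop.S04E01.WEB.h264-LiGATE. The next time my script reloads the page, I want it to find The.Repair.Shop.S04E01.WEB.h264-LiGATE in the page and then select the previous div whose <span> contains an <a> with the value TV. In other words, I need the previous element that has an <a> with the TV value, not simply the previous element. In the sample HTML, the 1st div has the TV value and the 2nd div does not. Any ideas?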
The Python code I have tried:
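# Assumption: Wsoup is an alias for bs4.BeautifulSoup and my_driver holds the page HTML.
from bs4 import BeautifulSoup as Wsoup

my_soup = Wsoup(my_driver, "html.parser")
last_rls = input("Please Insert starter Release From Predb.me ::::")
# Locate the <a> whose text is the release name entered by the user.
previous_rls = my_soup.find("a", text=last_rls)
print(previous_rls)
# Climb from <a> through <h2>, div.p-c-title and div.p-head to the enclosing div.post.
Entry = previous_rls.parent.parent.parent.parent
# Nearest earlier sibling post, whatever its category.
previous_rls_parent = Entry.find_previous_sibling("div", {"class": "post"})
print(previous_rls_parent)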
The Python code displays the previous element, but I need the previous element that contains an <a> tag with the TV value.
Answer 0 (score: 0)
If you want to display the text from the 3 <div> elements of the post you searched for, you can try the following:
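from bs4 import BeautifulSoup

search = "The.Repair.Shop.S04E01.WEB.h264-LiGATE"
soup = BeautifulSoup(my_driver, "html.parser")
# Find the <a> whose text matches the release name.
rls = soup.find("a", text=search)
# Nearest preceding div.p-head, which is the header of the same post.
div_parent = rls.find_previous('div', class_='p-head')
# Print the text of each div inside the header (time, category, title).
for div in div_parent.find_all('div'):
    print(div.get_text(strip=True))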
This will display the following 3 items:
48mins
TVSD
The.Repair.Shop.S04E01.WEB.h264-LiGATE
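To get what the question asks for, that is, the previous post whose top-level category is TV, one option is to walk backwards over the sibling posts and stop at the first one whose category link reads TV. The following is only a minimal sketch, assuming the page matches the sample markup (sibling `div.post` elements, with the top-level category in `a.c-adult`). `SAMPLE_HTML` is a trimmed copy of the sample, and `previous_post_in_category` is a hypothetical helper name introduced here for illustration:

```python
from bs4 import BeautifulSoup

# Trimmed copy of the sample markup from the question (assumption: the live
# page uses the same class names and keeps the posts as siblings).
SAMPLE_HTML = """
<div class="post" id="7517049"><div class="p-head">
<div class="p-c p-c-cat"><span class="p-cat c-5 c-7 "><a href="http://predb.me?cats=tv" class="c-adult">TV</a><a href="http://predb.me?cats=tv-hd" class="c-child">HD</a></span></div>
<div class="p-c p-c-title"><h2><a class="p-title" href="http://predb.me?post=7517049">The.Repair.Shop.S04E02.720p.WEBRip.x264-LiGATE</a></h2></div>
</div></div>
<div class="post" id="7517048"><div class="p-head">
<div class="p-c p-c-cat"><span class="p-cat c-24 c-25 "><a href="http://predb.me?cats=books" class="c-adult">Books</a><a href="http://predb.me?cats=books-ebooks" class="c-child">eBooks</a></span></div>
<div class="p-c p-c-title"><h2><a class="p-title" href="http://predb.me?post=7517048">John.Bell.Young.Puccini.A.Listeners.Guide.Dover.Books.on.Music.and.Music.History.2016.RETAiL.ePub.eBook-VENTOLiN</a></h2></div>
</div></div>
<div class="post" id="7517047"><div class="p-head">
<div class="p-c p-c-cat"><span class="p-cat c-5 c-6 "><a href="http://predb.me?cats=tv" class="c-adult">TV</a><a href="http://predb.me?cats=tv-sd" class="c-child">SD</a></span></div>
<div class="p-c p-c-title"><h2><a class="p-title" href="http://predb.me?post=7517047">The.Repair.Shop.S04E01.WEB.h264-LiGATE</a></h2></div>
</div></div>
"""


def previous_post_in_category(soup, title, category="TV"):
    """Return the nearest earlier div.post whose top-level category matches.

    Returns None if the title is not on the page or no earlier post matches.
    """
    link = soup.find("a", class_="p-title", string=title)
    if link is None:
        return None
    post = link.find_parent("div", class_="post")
    # find_previous_siblings yields siblings from nearest to farthest.
    for sibling in post.find_previous_siblings("div", class_="post"):
        cat = sibling.find("a", class_="c-adult")
        if cat is not None and cat.get_text(strip=True) == category:
            return sibling
    return None


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
hit = previous_post_in_category(soup, "The.Repair.Shop.S04E01.WEB.h264-LiGATE")
if hit is not None:
    # The Books post (7517048) is skipped; the TV post above it is returned.
    print(hit["id"], hit.find("a", class_="p-title").get_text(strip=True))
```

With the sample markup this skips the Books post and returns the first div (id 7517049). Returning `None` when nothing matches lets the caller decide what to do when the stored release has scrolled off the page.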