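I used Beautiful Soup to extract a result set from a web page's source. I then tried to parse the text content of each result. Everything works, but I also get back the heading text, which I don't need.
The code looks like this: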
syn_divs = soup.findAll("div", {"class": "synopsis-container"})
for syn in syn_divs:
    if syn.text != "Synopsis":
        print syn.text
The result set contains the following:
<div class="synopsis-container">
<h2 class="section-title smaller">Synopsis</h2>
<p class="synopsis" itemprop="description">Two lovers - Muslim Ali (Adam Bakri) and Christian Nino (María Valverde) - have their marriage hopes thwarted by her Georgian parents and then an act carried out by Ali that forces him to flee a mountain village. After reuniting, the couple are later separated by the Soviet invasion of Azerbaijan when Ali opts to take on the Red Army. Christopher Hampton (Dangerous Liaisons) adapts the 1937 novel.</p>
</div>
The string that comes back looks like this:
Synopsis
Two lovers - Muslim Ali (Adam Bakri) and Christian Nino (María Valverde) - have their marriage hopes thwarted by her Georgian parents and then an act carried out by Ali that forces him to flee a mountain village. After reuniting, the couple are later separated by the Soviet invasion of Azerbaijan when Ali opts to take on the Red Army. Christopher Hampton (Dangerous Liaisons) adapts the 1937 novel.
I want the film synopsis returned, but I don't want the heading "Synopsis" returned with it. What do I need to change in my code?
Thanks
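One possible direction, as a rough sketch rather than a definitive fix: the text of the outer div is the combined text of all its children, so it is never exactly equal to "Synopsis" and the comparison always passes. Selecting the inner p element with class "synopsis" instead of the whole container should leave the heading out. The sample markup below is shortened from the result set above, and the names html, synopses and para are only illustrative. It assumes the bs4 package with the built-in "html.parser".

```python
from bs4 import BeautifulSoup

# Sample markup shaped like the result set in the question (text shortened).
html = """
<div class="synopsis-container">
<h2 class="section-title smaller">Synopsis</h2>
<p class="synopsis" itemprop="description">Two lovers have their marriage hopes thwarted.</p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

synopses = []
for div in soup.find_all("div", {"class": "synopsis-container"}):
    # Read only the paragraph inside the container; div.text would join the
    # text of every child element, including the <h2> heading.
    para = div.find("p", {"class": "synopsis"})
    if para is not None:
        synopses.append(para.get_text().strip())

for text in synopses:
    print(text)
```

If some containers might lack the p element, the None check simply skips them; whether that is the right behaviour depends on the pages being scraped.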