Python HTML parsing: cannot get the second HTML tag

Date: 2018-02-16 19:58:48

Tags: python html-parsing

This is my HTML code.

                                <div class="panel panel-default">
                                    <a role="button" data-toggle="collapse" data-parent="#accordion" href="#11571" aria-expanded="true" aria-controls="collapseOne">
                                    <div class="panel-heading" role="tab" id="headingOne">
                                        <h4 class="panel-title">
                                            On Birinci Kalkınma Planı Vatandaş Anketi
                                        </h4>
                                    </div>
                                        </a>
                                    <div id="11571" class="panel-collapse collapse" role="tabpanel" aria-labelledby="headingOne">
                                        <div class="panel-body">
                                            <p><strong></strong></p>
                                            <p>T.C. Kalkınma Bakanlığı'nın(@kalkinma), &uuml;lkemizin gelecek 5 yılı i&ccedil;inde ulaşmak istediği hedefleri ortaya koyacak On Birinci Kalkınma Planı(2019-2023) i&ccedil;in hazırladığı "On Birinci Kalkınma Planı Vatandaş Anketi"ne aşağıdaki linkten ulaşabilirsiniz. <br /><a class="Link" href="http://kbanket.kalkinma.gov.tr/index.php/471624/lang-tr">http://kbanket.kalkinma.gov.tr/index.php/471624/lang-tr</a>&nbsp;</p>
                                            <div class="clear"></div>
                                        </div>
                                    </div>

                                </div>

This is my Python code. I want to get the data from the second <p> tag in the HTML above, but I can't find a solution.

# page_soup is the BeautifulSoup object built from the page above
data = page_soup.findAll("div", {"class": "panel panel-default"})

for content in data:
    title = content.find("h4", {"class": "panel-title"})
    detail = content.find("p")  # find() returns only the first <p>
    print(title.text)
    print(detail)       # shows <p><strong></strong></p>
    print(detail.text)  # shows nothing
    link = content.find("a", {"class": "pdf"})
    if not link:
        continue
    print(link.get("href"))

This is my output: Output
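A minimal sketch of one possible way to reach the second <p>, not a confirmed fix: find() returns only the first match, which here is the empty <p><strong></strong></p>, so the sketch collects every <p> with find_all() and takes index 1. The trimmed HTML string, the placeholder paragraph text, and the helper name extract_panels are assumptions made for illustration. The link lookup also assumes that any <a href> inside the panel body is wanted, since the sample HTML uses class="Link" rather than "pdf".

```python
from bs4 import BeautifulSoup

# Trimmed copy of the panel from the question. The paragraph text is a
# shortened placeholder (assumption); the structure matches the original.
HTML = """
<div class="panel panel-default">
  <div class="panel-heading" role="tab" id="headingOne">
    <h4 class="panel-title">On Birinci Kalkınma Planı Vatandaş Anketi</h4>
  </div>
  <div id="11571" class="panel-collapse collapse" role="tabpanel">
    <div class="panel-body">
      <p><strong></strong></p>
      <p>Survey text (shortened). <br /><a class="Link" href="http://kbanket.kalkinma.gov.tr/index.php/471624/lang-tr">http://kbanket.kalkinma.gov.tr/index.php/471624/lang-tr</a>&nbsp;</p>
      <div class="clear"></div>
    </div>
  </div>
</div>
"""


def extract_panels(html):
    """Hypothetical helper: return (title, detail, hrefs) for each panel."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for content in soup.find_all("div", class_="panel-default"):
        title_tag = content.find("h4", class_="panel-title")
        title = title_tag.get_text(strip=True) if title_tag else None

        # find_all() returns every <p>; index 1 is the second one.
        paragraphs = content.find_all("p")
        detail = None
        if len(paragraphs) > 1:
            detail = paragraphs[1].get_text(" ", strip=True)

        # Collect every link inside the panel body, whatever its class.
        body = content.find("div", class_="panel-body")
        hrefs = [a["href"] for a in body.find_all("a", href=True)] if body else []

        results.append((title, detail, hrefs))
    return results


if __name__ == "__main__":
    for title, detail, hrefs in extract_panels(HTML):
        print(title)
        print(detail)
        for href in hrefs:
            print(href)
```

If the position of the empty paragraph varies between panels, picking the first <p> whose get_text(strip=True) is non-empty may be more robust than a fixed index.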

0 answers:

No answers yet