Easy way to add a combination of tags, text, and links in BeautifulSoup?

Asked: 2018-11-20 19:19:53

Tags: python beautifulsoup

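I am looking to scrape references from a website and add them to my existing list of references. So far the scraping works, but I can't seem to manage the last step: appending the scraped reference to the existing references. Let me illustrate: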

The reference I managed to scrape:

scraped_ref = 'Case courtesy of Dr Sachintha Hapugoda, <a href="https://radiopaedia.org/">Radiopaedia.org</a>. From the case <a href="https://radiopaedia.org/cases/52525">rID: 52525</a> [Accessed 15 Nov. 2018].'

I need to add the following markup in front of that reference:

<br>3. <b>Image: </b>

Which would give this:

formatted_ref = '<br>3. <b>Image: </b>Case courtesy of Dr Sachintha Hapugoda, <a href="https://radiopaedia.org/">Radiopaedia.org</a>. From the case <a href="https://radiopaedia.org/cases/52525">rID: 52525</a> [Accessed 15 Nov. 2018].'

And then finally append the formatted reference to my existing reference list:

existing_ref = '''<p class="references" style="font-size:15px">1. Mcminn. (2003). Last's Anatomy. Elsevier Australia. ISBN:0729537528. <a href="http://books.google.com/books?vid=ISBN0729537528">Read it at Google Books</a> - <a href="http://www.amazon.com/gp/product/0729537528">Find it at Amazon</a><br>2. Netter, F. H. (2019). Atlas of human anatomy. Philadelphia, PA: Elsevier.</p>'''

I tried:

for p in soup.find_all("p", {"class":"references"}):
    print(p.append('<br>3. <b>Image: </b>' + scraped_ref))

But in the result I lose all the tag information:

<p class="references" style="font-size:15px">1. Mcminn. (2003). Last's Anatomy. Elsevier Australia. ISBN:0729537528. <a href="http://books.google.com/books?vid=ISBN0729537528">Read it at Google Books</a> - <a href="http://www.amazon.com/gp/product/0729537528">Find it at Amazon</a><br/>
2. Netter, F. H. (2019). Atlas of human anatomy. Philadelphia, PA: Elsevier.&lt;br&gt;3. &lt;b&gt;Image: &lt;/b&gt;Case courtesy of Dr Sachintha Hapugoda, &lt;a href="https://radiopaedia.org/"&gt;Radiopaedia.org&lt;/a&gt;. From the case &lt;a href="https://radiopaedia.org/cases/52525"&gt;rID: 52525&lt;/a&gt; [Accessed 15 Nov. 2018].</p>
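
A likely explanation for the escaped output, shown as a small diagnostic sketch: when a plain Python string is passed to `append`, BeautifulSoup stores it as a text node (`NavigableString`) rather than parsing it, so `<` and `>` are written out as entities. The one-line paragraph and the shortened reference text below are made up for illustration and are not the data from the question.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString

# Hypothetical, shortened stand-in for the existing reference paragraph
soup = BeautifulSoup('<p class="references">1. First reference.</p>', "html.parser")
p = soup.find("p", class_="references")

# Appending a str adds a single text node; the markup inside is not parsed
p.append('<br>2. <b>Image: </b>text')

last = p.contents[-1]
is_text_node = isinstance(last, NavigableString)
rendered = str(p)

print(type(last).__name__)  # the appended child is a text node, not a Tag
print(rendered)             # angle brackets appear as &lt; and &gt;
```

Because the appended child is text, searching the paragraph for a `<b>` tag afterwards finds nothing.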

What should I do?
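
One possible approach, offered as a sketch rather than a confirmed answer: parse the new markup with BeautifulSoup first, then append the resulting nodes to the paragraph one by one, so that `<br>`, `<b>` and `<a>` are real tags instead of text. The sketch assumes the `html.parser` backend; the existing references are abbreviated here, and `existing_html`, `fragment` and `result` are names chosen for illustration.

```python
from bs4 import BeautifulSoup

# Abbreviated version of the existing reference list (assumption: shortened for readability)
existing_html = (
    '<p class="references" style="font-size:15px">'
    "1. Mcminn. (2003). Last's Anatomy. Elsevier Australia.<br>"
    '2. Netter, F. H. (2019). Atlas of human anatomy.</p>'
)
scraped_ref = (
    'Case courtesy of Dr Sachintha Hapugoda, '
    '<a href="https://radiopaedia.org/">Radiopaedia.org</a>. From the case '
    '<a href="https://radiopaedia.org/cases/52525">rID: 52525</a> '
    '[Accessed 15 Nov. 2018].'
)

soup = BeautifulSoup(existing_html, "html.parser")

for p in soup.find_all("p", class_="references"):
    # Parse the fragment inside the loop so each matching <p> gets its own nodes
    fragment = BeautifulSoup('<br>3. <b>Image: </b>' + scraped_ref, "html.parser")
    # Copy the child list first: append() moves each node out of `fragment`
    for node in list(fragment.contents):
        p.append(node)

result = str(soup)
print(result)
```

With `html.parser`, the void `<br>` element is serialized as `<br/>`, which matches the output shown in the question. Building the prefix with `soup.new_tag("br")` and `soup.new_tag("b")` would be another option, but the scraped reference would still need to be parsed to keep its links.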

0 answers:

No answers