我在使用Python和BS4编写代码时遇到了问题。
假设有下一段:
<p id="paragraph">
Here is the paragraph <a href="/" id="url">Here an url</a> the paragraph continues.
</p>
我接受id并使用replace_with
替换字符串(在P和A标记中)。但在这种情况下,结果如下:
Here is the paragraph the paragraph continues. Here an url
结构不受尊重。什么是正确的方法?
添加一些代码:
page = open('file.html')
soupPage = BeautifulSoup(page)
findId = soupPage.find(id='nameOfId')
findId.replace_with('NewString')
答案 0 :(得分:1)
试试这个:
from bs4 import BeautifulSoup
page = open('file.html')
soupPage = BeautifulSoup(page)
findId = soupPage.find(id='url')
findId.contents[0].replace_with('NewString')
print soupPage
打印:
<html><body><p id="paragraph">
Here is the paragraph <a href="/" id="url">NewString</a> the paragraph continues.
</p></body></html>
希望这是你想要的。