I have a web page that I'm scraping and parsing with Beautiful Soup. The page contains several references to other sources. They look like this:
Shakespeare wrote good, such as in <a href="link_to_source">Romeo and Juliet, IV:ii</a>.
What I'd like to have is:
Shakespeare wrote good, such as in (Romeo and Juliet, IV:ii).
Bear in mind that this is a very long web page with many such links, so modifying a single "a" tag won't work for me; I need to modify every "a" tag on the page.
This is something I've tried already:
piska_ps = url_to_soup('https://he.wikisource.org'+a['href']).find_all('p')
p_box = []
for p in piska_ps:
    if p.a:
        for a_link in p.a:
            a_link.string = "("+a_link.string+")"
Answer 0 (score: 0)
You can use replace_with to replace each tag with plain text:
piska_ps = url_to_soup('https://he.wikisource.org'+a['href']).find_all('p')
for p in piska_ps:
    for a in p.find_all('a'):
        a.replace_with("(" + a.string + ")")
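For anyone who wants to try this without fetching the page, here is a minimal self-contained sketch of the same replace_with idea. The sample HTML and the helper name parenthesize_links are made up for illustration; url_to_soup from the question is not needed here. It uses get_text() instead of .string on the assumption that some links may contain nested tags, in which case .string would be None.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the real page.
html = (
    '<p>Shakespeare wrote good, such as in '
    '<a href="link_to_source">Romeo and Juliet, IV:ii</a>.</p>'
    '<p>See also <a href="other_source"><i>Hamlet</i>, III:i</a> and '
    '<a href="third_source">Macbeth, I:vii</a>.</p>'
)


def parenthesize_links(soup):
    """Replace every <a> tag with its text wrapped in parentheses."""
    # find_all returns a list, so it is safe to modify the tree in the loop.
    for link in soup.find_all('a'):
        # get_text() also works when the link contains nested tags,
        # where .string would be None.
        link.replace_with('(' + link.get_text() + ')')
    return soup


soup = parenthesize_links(BeautifulSoup(html, 'html.parser'))
paragraphs = [p.get_text() for p in soup.find_all('p')]
print(paragraphs[0])
```

Calling find_all('a') on the whole soup rather than per paragraph is a design choice for the sketch; looping over the paragraphs first, as in the answer above, should behave the same for links inside "p" tags.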
Answer 1 (score: 0)
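First, p.a is equivalent to p.find('a'), which returns a single tag (the first match), so looping over it does not loop over the links in the paragraph.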
piska_ps = url_to_soup('https://he.wikisource.org'+a['href']).find_all('p')
p_box = []
for p in piska_ps:
    if p.a:
        p.a.string = "("+p.a.string+")"
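Note that the snippet above only touches the first link in each paragraph. As a rough sketch of how the same idea might be extended to every link while keeping the "a" tags (and their href attributes) in place, one could loop over find_all('a') instead. The sample HTML below is invented for the demonstration.

```python
from bs4 import BeautifulSoup

# Hypothetical paragraph with two links.
html = '<p>One <a href="x">first</a> and <a href="y">second</a>.</p>'
p = BeautifulSoup(html, 'html.parser').p

first_link = p.a             # same as p.find('a'): only the first match
all_links = p.find_all('a')  # a list of every <a> inside the paragraph

for link in all_links:
    # Keep the <a> tag but wrap its text in parentheses.
    link.string = '(' + link.get_text() + ')'

result = str(p)
print(result)
```

This keeps the links clickable; if the goal is plain text with no "a" tags left, replace_with is the better fit.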