For example, I have the string below.
I need to replace the word 'game' everywhere except inside the <a> tag.
"<p class='glossary_hover'> cricket is good game <li> hello game </li> <a href='https://game.in'> game of thrones </a> need to replace game word </p>"
The expected result is:
"<p class='glossary_hover'> cricket is good game2 <li> hello game2 </li> <a href='https://game.in'> game of thrones </a> need to replace game2 word </p>"
Here 'game' has been replaced with 'game2' everywhere except inside the <a> tag.
Answer 0 (score: 0)
Using re:
#!/usr/bin/env python3
import re

s = "<p class='glossary_hover'> cricket is good game <li> hello game </li> <a href='https://game.in'> game of thrones </a> need to replace game word </p>"
# Map each <a> element with "game" replaced back to its original text.
mapping = {}
# Non-greedy .*? so that several <a> elements are matched one by one.
for a in re.findall(r"<a\b[^>]*>.*?</a>", s):
    mapping[a.replace("game", "game2")] = a
# Replace everywhere, then restore the <a> elements.
s = s.replace("game", "game2")
for a_game2, a_original in mapping.items():
    s = s.replace(a_game2, a_original)
print(s)
Using bs4:
#!/usr/bin/env python3
from bs4 import BeautifulSoup

s = "<p class='glossary_hover'> cricket is good game <li> hello game </li> <a href='https://game.in'> game of thrones </a> need to replace game word </p>"
soup = BeautifulSoup(s, "html.parser")
# Map each <a> element with "game" replaced back to its original text.
mapping = {}
for a_tag in soup.find_all("a"):
    # bs4 serializes attributes with double quotes; convert them back
    # so the text matches the single-quoted source string.
    a = str(a_tag).replace("\"", "'")
    mapping[a.replace("game", "game2")] = a
# Replace everywhere, then restore the <a> elements.
s = s.replace("game", "game2")
for a_game2, a_original in mapping.items():
    s = s.replace(a_game2, a_original)
print(s)
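Explanation:
For this example, we create a dictionary named mapping and store in it everything found inside the a tags. Each key is the a tag's text with game replaced by game2, and each value is the original text.
This lets us replace game with game2 across the whole string, then run a second replacement pass that puts back everything previously found in the a tags.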
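To make the dictionary's contents concrete, here is a small sketch that builds mapping for the example string with the re approach and prints its single entry. It is only an inspection aid, not a third solution; the non-greedy .*? pattern is my assumption so that each a tag would be captured separately if there were several.

```python
import re

s = ("<p class='glossary_hover'> cricket is good game <li> hello game </li> "
     "<a href='https://game.in'> game of thrones </a> need to replace game word </p>")

# Key: the <a> element as it will look after the global replace.
# Value: the <a> element as it appears in the original string.
mapping = {}
for a in re.findall(r"<a\b[^>]*>.*?</a>", s):
    mapping[a.replace("game", "game2")] = a

for replaced, original in mapping.items():
    print("key:  ", replaced)
    print("value:", original)
```

Note that the key also has game2 in the href, because the global replace touches attribute values too; that is exactly the text the second pass has to find and restore.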
The re and bs4 scripts both produce the same result:
<p class='glossary_hover'> cricket is good game2 <li> hello game2 </li> <a href='https://game.in'> game of thrones </a> need to replace game2 word </p>
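As an alternative sketch (not part of the answer above), the same output can probably be reached in one pass by splitting the string on the a tags and replacing only in the pieces between them. The function name replace_outside_anchors is hypothetical. Like the scripts above, this does plain substring replacement, and a regex is not a real HTML parser, so nested or malformed a tags are not handled.

```python
import re

def replace_outside_anchors(html, old, new):
    """Replace `old` with `new` everywhere except inside <a>...</a> elements."""
    # With a capturing group, re.split keeps the matched <a> elements in the
    # result: even indices hold the text between them, odd indices hold the
    # <a> elements themselves.
    parts = re.split(r"(<a\b[^>]*>.*?</a>)", html, flags=re.DOTALL)
    return "".join(
        part if i % 2 else part.replace(old, new)
        for i, part in enumerate(parts)
    )

s = ("<p class='glossary_hover'> cricket is good game <li> hello game </li> "
     "<a href='https://game.in'> game of thrones </a> need to replace game word </p>")
result = replace_outside_anchors(s, "game", "game2")
print(result)
```

This avoids the replace-then-restore round trip, so it does not depend on the replaced a tag text being found again in the modified string.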