I'm trying to detect the spaces between HTML tags.
In my case:
[...]a lot of code[...]<strong>text with spaces. and dots</strong>[...]a lot of code[...]
My goal is to get:
[...]a lot of code[...]<strong>textwithspaces.anddots</strong>[...]a lot of code[...]
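That's all. I tried something like (?<=<strong>.*)\s(?=</strong>), but it doesn't match the spaces I want to remove. It is important that I remove only these spaces and not any others, because removing other spaces could break the code.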
Answer 0: (score: 1)
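import re
x = "<strong>text with spaces</strong>"
# Removes every run of whitespace in x. This is safe here only because
# x holds nothing but the <strong> element itself.
x = re.sub(r"\s+", "", x)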
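The snippet above strips whitespace from the whole string, so on a full page it would also remove spaces outside the tags. Here is a minimal sketch of one way to limit the replacement to the text inside each `<strong>...</strong>` pair, using a replacement function with `re.sub`. The names `STRONG_RE` and `strip_spaces_in_strong` are made up for illustration, and the sketch assumes the tags carry no attributes and are not nested. A regex is not a full HTML parser.

```python
import re

# Capture the opening tag, the inner text (non-greedy), and the closing tag.
# re.DOTALL lets the inner text span several lines.
STRONG_RE = re.compile(r"(<strong>)(.*?)(</strong>)", re.DOTALL)


def strip_spaces_in_strong(html: str) -> str:
    """Remove whitespace only between <strong> and </strong>."""

    def _collapse(match: re.Match) -> str:
        opening, inner, closing = match.groups()
        # Whitespace is removed from the captured inner text only.
        return opening + re.sub(r"\s+", "", inner) + closing

    return STRONG_RE.sub(_collapse, html)


sample = "<p>keep these spaces</p> <strong>text with spaces. and dots</strong> <em>and these</em>"
print(strip_spaces_in_strong(sample))
# <p>keep these spaces</p> <strong>textwithspaces.anddots</strong> <em>and these</em>
```

A callback is used here because Python's built-in `re` module rejects variable-width lookbehinds such as `(?<=<strong>.*)`, so the pattern from the question cannot be used there unchanged.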
Answer 1: (score: 0)
As requested, here is my solution:
set(#scrape, $replace regular expression(#scrape, "</strong>\\s|\\s</strong>|</strong>|<strong>\\s|\\s<strong>|<strong>", ""), "Global")
set(#scrape, $replace regular expression(#scrape, "(?<=<strong>.*?)\\s(?=.*?</strong>)", ""), "Global")
set(#scrape, $replace regular expression(#scrape, ""\\s|\\s"|"", ""), "Global")