Question

I'm trying to match the whitespace between HTML tags.

In my case:

[...]a lot of code[...]<strong>text with spaces. and dots</strong>[...]a lot of code[...]

My goal is to get:

[...]a lot of code[...]<strong>textwithspaces.anddots</strong>[...]a lot of code[...]

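That's all. I tried something like (?<=<strong>.*)\s(?=</strong>), but it doesn't match the spaces I want to remove. It is important that I remove only these spaces and no others, because removing other whitespace could break the code.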

Answer 1

import re
x = "<strong>text with spaces</strong>"
# Removes every run of whitespace in x, so x should hold only the tag to clean
x = re.sub(r"\s+", "", x)
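The substitution above strips whitespace from the entire string, so it only fits when x contains nothing but the tag. Below is a minimal sketch for a larger document. It assumes the <strong> tags are not nested and carry no attributes. The names STRONG_RE and strip_strong_whitespace are made up for illustration. Python's re module does not accept the variable-width lookbehind from the question, so this sketch matches each whole element and cleans its content in a callback instead.

```python
import re

# Matches one non-nested <strong>...</strong> pair (assumption: no attributes).
# DOTALL lets the content span several lines.
STRONG_RE = re.compile(r"(<strong>)(.*?)(</strong>)", re.DOTALL)

def strip_strong_whitespace(html: str) -> str:
    """Remove whitespace inside <strong> elements and leave the rest untouched."""
    def _collapse(match: re.Match) -> str:
        open_tag, content, close_tag = match.groups()
        # Only the text between the tags loses its whitespace.
        return open_tag + re.sub(r"\s+", "", content) + close_tag
    return STRONG_RE.sub(_collapse, html)

sample = "<p>keep these spaces</p> <strong>text with spaces. and dots</strong> <p>and these</p>"
result = strip_strong_whitespace(sample)
print(result)
# <p>keep these spaces</p> <strong>textwithspaces.anddots</strong> <p>and these</p>
```

Whitespace outside the <strong> element is preserved, which is the constraint stated in the question. For markup that is nested or irregular, an HTML parser would likely be more robust than a regular expression.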

Answer 2

As requested, here is my solution:

set(#scrape, $replace regular expression(#scrape, "</strong>\\s|\\s</strong>|</strong>|<strong>\\s|\\s<strong>|<strong>", ""), "Global")
set(#scrape, $replace regular expression(#scrape, "(?<=<strong>.*?)\\s(?=.*?</strong>)", ""), "Global")
set(#scrape, $replace regular expression(#scrape, "&quot;\\s|\\s&quot;|&quot;", ""), "Global")

Remove every space between tags using only regular expressions
