Question

一个例子：

<p> Hello</p>
<div>hgello</div>
<pre>
   code
    code
<pre>

变成了类似的东西：

<p> Hello</p><div>hgello</div><pre>
    code
     code
<pre>

如何在python中执行此操作？我也大量使用＆lt;预＆GT;标签所以用''替换所有'\ n'不是一种选择。

最好的方法是什么？

Answer 1

您可以使用re.sub(">\s*<","><","[here your html string]")。

可能是string.replace(">\n",">")，即查找一个封闭的括号和换行符并删除换行符。

Answer 2

我会选择使用python正则表达式：

string.replace(">\s+<","><")

'\ s'找到任何空格字符，而'+'显示匹配一个或多个空白字符。这消除了更换替换的可能性

<pre>
    code
     code
<pre>

带

<pre><pre>

可以找到有关正则表达式的更多信息here，here和here。