Question

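I have an HTML email and I use Beautiful Soup to extract the text. I then want to remove any leading whitespace, but no matter how many times I try textwrap.dedent or string.strip(), the whitespace is not removed from some lines. I did a print repr(string), and this is the output: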

\r\n   content

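This means there are actual spaces between the \r\n and the content on that line, and they remain even after I use strip or anything else to remove them. How can I deal with this?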

Current code:

no_html = BeautifulSoup(message).get_text()
final_message = no_html.strip()
print final_message

Answer 1

In this case, split() seems to work for me.

Add the following code:

newmsg = newmsg + "\n" + ' '.join(line.split())
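
The line above presumably sits inside a loop over the lines of the extracted text, which the answer does not show. A minimal sketch of what that could look like, assuming the text is in no_html as in the question; the helper name collapse_whitespace and the sample string are made up for illustration:

```python
def collapse_whitespace(text):
    """Strip each line and collapse internal whitespace runs to one space."""
    cleaned = []
    # splitlines() treats \n, \r\n and \r as line endings
    for line in text.splitlines():
        # split() with no arguments splits on any run of whitespace and
        # drops empty pieces, so the joined line has no leading spaces
        cleaned.append(' '.join(line.split()))
    return "\n".join(cleaned)


# Hypothetical sample resembling the repr() output in the question
no_html = "Hello\r\n   content\r\n\t more   text  "
final_message = collapse_whitespace(no_html)
print(final_message)
```

Unlike the one-liner above, this builds a list and joins it once, so the result does not start with an extra "\n". It also collapses repeated spaces inside a line, which may or may not be wanted.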

Answer 2

This worked for me:

no_html.rstrip().strip()

It removed all the '\t\n' separators from my output.
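
One caveat worth checking: strip() and rstrip() only trim the two ends of the whole string, so whitespace that follows a \r\n in the middle of the text is left alone, which matches the behaviour described in the question. A small sketch of the difference, using a made-up sample string, with per-line stripping as one possible alternative:

```python
# Hypothetical sample with leading spaces after an interior \r\n
text = "  first\r\n   content\r\n"

# strip() trims only the start and end of the whole string,
# so the spaces before "content" are still present
whole = text.strip()

# Stripping each line separately removes the leading spaces on every line
per_line = "\n".join(line.strip() for line in text.splitlines())

print(repr(whole))
print(repr(per_line))
```

If the input has whitespace only at the very start or end, a single strip() is enough, and chaining rstrip() before it adds nothing.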