Question

I would like to know how to remove the encoding declaration that prettify automatically adds in BeautifulSoup. For example:

tree='''<A attribute1="1" attribute2="2">
 <B>
  <C/>
 </B>
</A>'''
from collections import defaultdict
from bs4 import BeautifulSoup as Soup
root = Soup(tree, 'lxml-xml')
print(root.prettify().replace('\n', ''))

The output looks like this:

<?xml version="1.0" encoding="utf-8"?><A attribute1="1" attribute2="2"> <B>  <C/> </B></A>

I would like to get just:

<A attribute1="1" attribute2="2"> <B>  <C/> </B></A>

Answer 1

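There are a few ways to solve this:

First, you can call root.decode_contents(), which gives you the contents as non-prettified output and omits the XML declaration.

Or you can prettify each node in the contents separately and then join the results, like this: '\n'.join(x.prettify() for x in root.contents).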
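
As a minimal sketch of both options, using the tree from the question (the variable names raw and pretty are only illustrative, and the second option assumes every top-level node is a tag, since plain strings have no prettify() method):

```python
from bs4 import BeautifulSoup

tree = '''<A attribute1="1" attribute2="2">
 <B>
  <C/>
 </B>
</A>'''

root = BeautifulSoup(tree, 'lxml-xml')

# Option 1: serialize only the children of the document object.
# The result is not prettified and carries no XML declaration.
raw = root.decode_contents()

# Option 2: prettify each top-level node on its own, then join the pieces.
# Only the document object itself adds the declaration, so it is absent here.
pretty = '\n'.join(x.prettify() for x in root.contents)

print(raw)
print(pretty.replace('\n', ''))
```

Both variants avoid serializing the BeautifulSoup document object directly, which appears to be the only place the declaration gets added.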
