lxml: how to get the raw attribute value with CRLF preserved

Asked: 2014-01-09 13:10:30

Tags: python attributes whitespace lxml

Here is my small Python program:

import xml.etree.ElementTree as etree
tree = etree.parse('test.xml')
root = tree.getroot()
print(root.attrib['a'])

And here is the test.xml file:

<?xml version="1.0" encoding="utf-8" ?>
<root a="line one
line two
line three">
</root>

When I run it, I get:

line one line two line three

whereas I expected:

line one
line two
line three

How can I get the expected behavior?

1 Answer:

Answer 0 (score: 0)

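You could try BeautifulSoup, a great Python library for parsing semi-broken XML/HTML. It is a lenient parser rather than a strict XML parser, so it leaves the line breaks in the attribute value alone. E.g.: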

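from bs4 import BeautifulSoup

# Read the whole file as text; the context manager closes it afterwards.
with open('test.xml') as f:
    xml = f.read()

# Name the parser explicitly so the result does not depend on which
# parsers happen to be installed; 'html.parser' ships with Python.
soup = BeautifulSoup(xml, 'html.parser')
print(soup.root.attrs['a'])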

This prints:

line one
line two
line three
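
For background: the XML specification requires a conforming parser to normalize literal line breaks inside attribute values to spaces, which is why ElementTree returns a single line. Character references such as &#10; are not normalized, so if you control how the file is written, escaping the line breaks is another option. A minimal sketch, assuming the XML is supplied as inline strings rather than read from test.xml (the variable names are only for illustration):

```python
import xml.etree.ElementTree as etree

# Literal newlines inside an attribute value: the parser turns them into spaces.
literal = '<root a="line one\nline two\nline three"></root>'

# The same value with each newline written as the character reference &#10;:
# the parser keeps these as real newline characters.
escaped = '<root a="line one&#10;line two&#10;line three"></root>'

literal_value = etree.fromstring(literal).attrib['a']
escaped_value = etree.fromstring(escaped).attrib['a']

print(repr(literal_value))
print(repr(escaped_value))
```

For a CRLF line ending, the carriage return would have to be escaped the same way, as &#13;&#10;.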