Question

I have XML files that match this DTD:

<!ELEMENT root (node, notinteresting)>
<!ELEMENT node (node*)>
<!ELEMENT notinteresting (#PCDATA)>

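I want to retrieve the topmost node of such a file (in XPath: /root/node) and everything below it, ignoring the notinteresting bit. How can I do that in a few lines of Python? Speed and memory consumption are not an issue. I just want something I can print.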

Answer 1

You can use the ElementTree API. Depending on the version you use, the import may differ slightly. You need Python >= 2.7.

from xml.etree.ElementTree import ElementTree
tree = ElementTree()
tree.parse("yourdoc.xml")
roottree = tree.getroot()

That then lets you do something like this:

for c in roottree:  # iterating an element yields its direct children
    print(c.tag)
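
As a minimal sketch of that loop, assuming a small made-up document that follows the DTD from the question (the SAMPLE string and the child_tags name are illustrative, not from the original post):

```python
import io
from xml.etree.ElementTree import ElementTree

# Hypothetical sample document matching the DTD in the question.
SAMPLE = (
    "<root>"
    "<node><node/><node/></node>"
    "<notinteresting>skip me</notinteresting>"
    "</root>"
)

tree = ElementTree()
tree.parse(io.StringIO(SAMPLE))  # parse() accepts a filename or a file object
roottree = tree.getroot()

# Iterating an Element yields its direct children, in document order.
child_tags = [c.tag for c in roottree]
print(child_tags)
```

Iterating the element directly is used here because getchildren() is no longer available in recent Python 3 releases.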

Note that if your input is just a string rather than a file to parse, you can use fromstring().
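
A sketch of the string-based route, which also produces something printable for /root/node. The xml_text sample and the variable names are assumptions made for illustration:

```python
import xml.etree.ElementTree as ET

# Hypothetical input held in a string rather than a file.
xml_text = (
    "<root>"
    "<node><node><node/></node><node/></node>"
    "<notinteresting>skip me</notinteresting>"
    "</root>"
)

root = ET.fromstring(xml_text)  # returns the root Element directly
node = root.find("node")        # the element at XPath /root/node

# Serialize only the <node> subtree; <notinteresting> is a sibling of
# <node>, so it is not part of the output.
printable = ET.tostring(node, encoding="unicode")
print(printable)
```

Note that tostring() also emits any text that trails the element (its tail), so with an indented input file the result may end in whitespace.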

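Update: find() searches relative to the root element, so if "root" is the root element of the XML file, you can also fetch its node child directly with

node = tree.find('node')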
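
A quick way to check what find() matches, as a sketch on a made-up document (the sample XML and the names missing, top_node and all_nodes are illustrative). The search starts at the children of the root element, so looking up the root's own tag name finds nothing:

```python
import xml.etree.ElementTree as ET

# Hypothetical document matching the DTD in the question.
tree = ET.ElementTree(ET.fromstring(
    "<root><node><node/></node><notinteresting>x</notinteresting></root>"
))

# find() looks among the children of the root element,
# so the root's own name does not match anything.
missing = tree.find("root")
top_node = tree.find("node")  # the element at /root/node

# Collect every <node> below the root, at any depth.
all_nodes = tree.findall(".//node")
print(missing, top_node.tag, len(all_nodes))
```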

Answer 2

Take a look at 2 modules,

Both will let you do what you want, though in slightly different ways.