I am writing my first Python script, using libxml2, to retrieve data from an XML file. The file looks like this:
<myGroups1>
  <myGrpContents name="ABC" help="abc_help">
    <myGrpKeyword name="abc1" help="help1"/>
    <myGrpKeyword name="abc2" help="help2"/>
    <myGrpKeyword name="abc3" help="help3"/>
  </myGrpContents>
</myGroups1>
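The file contains many similar groups. My goal is to read the "name" and "help" attributes and write them to another file in a different format. However, with the following code I can only retrieve the myGroups1 element.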
doc = libxml2.parseFile(cmmfilename)
root2 = doc.children
child = root2.children
while child is not None:
    if not child.isBlankNode():
        if child.type == "element":
            print "\t Element ", child.name, " with ", child.lsCountNode(), "child(ren)"
            print "\t and content ", repr(child.content)
    child = child.next
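How do I iterate deeper into the elements and get the attributes? Any help would be greatly appreciated.
Answer 0 (score: 0)
I have not used libxml2, but I looked into it and found the following.
Try one of these: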
if child.type == "element":
    if child.name == "myGrpKeyword":
        print child.prop('name')
        print child.prop('help')
or
if child.type == "element":
    if child.name == "myGrpKeyword":
        for property in child.properties:
            if property.type=='attribute':
                # check what is the attribute
                if property.name == 'name':
                    print property.content
                if property.name == 'help':
                    print property.content
Reference: http://ukchill.com/technology/getting-started-with-libxml2-and-python-part-1/
Update
Try a recursive function:
def explore(child):
    # Walk this node and all of its following siblings.
    while child is not None:
        if not child.isBlankNode():
            if child.type == "element":
                # prop() returns None when the attribute is missing.
                print child.prop('name')
                print child.prop('help')
                # Descend into this element's children.
                explore(child.children)
        child = child.next

doc = libxml2.parseFile(cmmfilename)
root2 = doc.children
child = root2.children
explore(child)
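The same recursion can also be written as a generator, which separates walking the tree from deciding what to print. The following is only a sketch in Python 3 syntax. It assumes nodes expose `type`, `children`, `next` and `prop()` as libxml2 nodes do in this thread; `StubNode` and `iter_elements` are hypothetical names, and `StubNode` is a stand-in so the example runs without libxml2 installed.

```python
class StubNode:
    """Hypothetical stand-in that mimics the libxml2 node attributes used below."""

    def __init__(self, name, attrs=None, kids=(), node_type="element"):
        self.name = name
        self.type = node_type
        self._attrs = dict(attrs or {})
        self.next = None
        # Link the children into a sibling chain, the way libxml2 exposes them.
        kids = list(kids)
        for left, right in zip(kids, kids[1:]):
            left.next = right
        self.children = kids[0] if kids else None

    def prop(self, key):
        # Mirrors prop(): returns None when the attribute is absent.
        return self._attrs.get(key)


def iter_elements(node):
    """Yield element nodes depth-first, following `children` and `next`."""
    while node is not None:
        if node.type == "element":
            yield node
            yield from iter_elements(node.children)
        node = node.next


def blank():
    # Whitespace between tags shows up as text nodes in the real tree.
    return StubNode("text", node_type="text")


keywords = [
    StubNode("myGrpKeyword", {"name": "abc%d" % i, "help": "help%d" % i})
    for i in (1, 2, 3)
]
contents = StubNode(
    "myGrpContents",
    {"name": "ABC", "help": "abc_help"},
    [blank(), keywords[0], blank(), keywords[1], blank(), keywords[2], blank()],
)
root = StubNode("myGroups1", kids=[blank(), contents, blank()])

# Keep only the elements that actually carry a "name" attribute.
pairs = [
    (node.prop("name"), node.prop("help"))
    for node in iter_elements(root)
    if node.prop("name") is not None
]
for name, help_text in pairs:
    print(name, help_text)
```

With a real document, the root element of the parsed tree would presumably be passed to `iter_elements` in place of the stub `root`.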
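Answer 1 (score: 0)
"python. how to get attribute value with libxml2" may be the answer you are looking for.
When I run into a problem like this and, for whatever reason, don't want to read the documentation, it helps to explore the library interactively. I suggest trying this in an interactive Python REPL (I like bpython). Here is the session in which I worked out the solution: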
>>> import libxml2
>>> xml = """<myGroups1>
... <myGrpContents name="ABC" help="abc_help">
... <myGrpKeyword name="abc1" help="help1"/>
... <myGrpKeyword name="abc2" help="help2"/>
... <myGrpKeyword name="abc3" help="help3"/>
... </myGrpContents>
... </myGroups1>"""
>>> tree = libxml2.parseMemory(xml, len(xml)) # I found this method by looking through `dir(libxml2)`
>>> tree.children
<xmlNode (myGroups1) object at 0x10aba33b0>
>>> a = tree.children
>>> a
<xmlNode (myGroups1) object at 0x10a919ea8>
>>> a.children
<xmlNode (text) object at 0x10ab24368>
>>> a.properties
>>> b = a.children
>>> b.children
>>> b.properties
>>> b.next
<xmlNode (myGrpContents) object at 0x10a921290>
>>> b.next.content
'\n \n \n \n'
>>> b.next.next.content
'\n'
>>> b.next.next.next.content
Traceback (most recent call last):
File "<input>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'content'
>>> b.next.next.next
>>> b.next.properties
<xmlAttr (name) object at 0x10aba32d8>
>>> b.next.properties.children
<xmlNode (text) object at 0x10ab40f38>
>>> b.next.properties.children.content
'ABC'
>>> b.next.properties.children.name
'text'
>>> b.next.properties.next
<xmlAttr (help) object at 0x10ab40fc8>
>>> b.next.properties.next.name
'help'
>>> b.next.properties.next.content
'abc_help'
>>> list(tree)
[<xmlDoc (None) object at 0x10a921248>, <xmlNode (myGroups1) object at 0x10aba32d8>, <xmlNode (text) object at 0x10aba3878>, <xmlNode (myGrpContents) object at 0x10aba3d88>, <xmlNode (text) object at 0x10aba3950>, <xmlNode (myGrpKeyword) object at 0x10aba3758>, <xmlNode (text) object at 0x10aba3320>, <xmlNode (myGrpKeyword) object at 0x10aba3f38>, <xmlNode (text) object at 0x10aba3560>, <xmlNode (myGrpKeyword) object at 0x10aba3998>, <xmlNode (text) object at 0x10aba33f8>, <xmlNode (text) object at 0x10aba38c0>]
>>> good = list(tree)[5]
>>> good.properties
<xmlAttr (name) object at 0x10aba35f0>
>>> good.prop('name')
'abc1'
>>> good.prop('help')
'help1'
>>> good.prop('whoops')
>>> good.hasProp('whoops')
>>> good.hasProp('name')
<xmlAttr (name) object at 0x10ab40ef0>
>>> good.hasProp('name').content
'abc1'
>>> for thing in tree:
...     if thing.hasProp('name') and thing.hasProp('help'):
...         print thing.prop('name'), thing.prop('help')
...
ABC abc_help
abc1 help1
abc2 help2
abc3 help3
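Because this was bpython, I cheated a little: it has a rewind key, so I made more typos than shown here, but otherwise this is close to what I typed.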
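If libxml2 is not a hard requirement, the whole task from the question (collect "name" and "help", then write them out in another format) can be sketched with the standard library's xml.etree.ElementTree. This swaps in a different library from the one the question uses. The tab-separated output format and the helper names `name_help_pairs` and `write_report` are assumptions made for illustration only.

```python
import io
import xml.etree.ElementTree as ET

XML = """<myGroups1>
  <myGrpContents name="ABC" help="abc_help">
    <myGrpKeyword name="abc1" help="help1"/>
    <myGrpKeyword name="abc2" help="help2"/>
    <myGrpKeyword name="abc3" help="help3"/>
  </myGrpContents>
</myGroups1>"""


def name_help_pairs(xml_text):
    """Return (tag, name, help) for every element that has both attributes."""
    root = ET.fromstring(xml_text)
    rows = []
    # iter() visits the root and all descendants in document order.
    for element in root.iter():
        name = element.get("name")
        help_text = element.get("help")
        if name is not None and help_text is not None:
            rows.append((element.tag, name, help_text))
    return rows


def write_report(rows, out):
    """Write one tab-separated line per row to a file-like object."""
    for tag, name, help_text in rows:
        out.write("%s\t%s\t%s\n" % (tag, name, help_text))


rows = name_help_pairs(XML)
buffer = io.StringIO()  # stands in for a file opened with open(path, "w")
write_report(rows, buffer)
print(buffer.getvalue(), end="")
```

For a file on disk, `ET.parse(path).getroot()` could replace `ET.fromstring`, and an opened output file could replace the `StringIO` buffer.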