Trouble getting data out of an XML element in Python

Time: 2011-01-25 21:02:32

Tags: python xml

I am parsing the XML output of another program.

Here is a sample of the XML:

<result test="Passed" stamp="2011-01-25T12:40:46.166-08:00">
        <assertion>MultipleTestTool1</assertion>
        <comment>MultipleTestTool1 Passed</comment>
      </result>

I want to get the data from the <comment> element.

Here is my code snippet:

import xml.dom.minidom

# Inside the calling method:
mydata.cnodes = mydata.rnode.getElementsByTagName("comment")
value = self.getResultCommentText(mydata.cnodes)

def getResultCommentText(self, nodelist):
    rc = []
    for node in nodelist:
        if node.nodeName == "comment":
            if node.nodeType == node.TEXT_NODE:
                rc.append(node.data)
    return ''.join(rc)

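The value is always empty, and it looks like nodeType is always ELEMENT_NODE, so .data does not exist. I am new to Python and this is giving me a headache. Can anyone tell me what I am doing wrong?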

3 answers:

Answer 0 (score: 1)

Try using ElementTree instead of minidom:

>>> import xml.etree.ElementTree as et
>>> data = """
... <result test="Passed" stamp="2011-01-25T12:40:46.166-08:00">
...         <assertion>MultipleTestTool1</assertion>
...         <comment>MultipleTestTool1 Passed</comment>
...       </result>
... """
>>> root = et.fromstring(data)
>>> root.tag
'result'
>>> root[0].tag
'assertion'
>>> root[1].tag
'comment'
>>> root[1].text
'MultipleTestTool1 Passed'
>>> root.findtext('comment')
'MultipleTestTool1 Passed'
>>>
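
Outside the interactive prompt, the same lookup could be wrapped in a small function. The following is only a sketch under assumptions: the <results> wrapper element, the second <result> entry, and the names SAMPLE and comment_texts are hypothetical, since the question shows just a single <result> fragment.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the <results> wrapper and the second <result>
# are assumptions added for illustration only.
SAMPLE = """
<results>
  <result test="Passed" stamp="2011-01-25T12:40:46.166-08:00">
    <assertion>MultipleTestTool1</assertion>
    <comment>MultipleTestTool1 Passed</comment>
  </result>
  <result test="Failed">
    <assertion>MultipleTestTool2</assertion>
  </result>
</results>
"""


def comment_texts(xml_text):
    """Return (test attribute, comment text) pairs, one per <result>."""
    root = ET.fromstring(xml_text)
    pairs = []
    for result in root.iter("result"):
        # findtext returns the default when there is no <comment> child
        text = result.findtext("comment", default="")
        pairs.append((result.get("test"), text))
    return pairs


if __name__ == "__main__":
    for status, comment in comment_texts(SAMPLE):
        print(status, repr(comment))
```

Passing a default to findtext keeps a missing <comment> from turning into None, which may or may not be what the calling code wants.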

Answer 1 (score: 0)

Here you go:

>>> from lxml import etree
>>> result = """
... <result test="Passed" stamp="2011-01-25T12:40:46.166-08:00">
...         <assertion>MultipleTestTool1</assertion>
...         <comment>MultipleTestTool1 Passed</comment>
...       </result>
... """
>>> xml = etree.fromstring(result)
>>> xml.xpath('//comment/text()')
['MultipleTestTool1 Passed']
>>> 

Answer 2 (score: 0)

Sticking with minidom, I modified your snippet to show the approach that is needed:

import xml.dom.minidom

# Inside the calling method:
mydata.cnodes = mydata.rnode.getElementsByTagName("comment")
value = self.getResultCommentText(mydata.cnodes)

def getResultCommentText(self, nodelist):
    rc = []
    for node in nodelist:
        # Since the node list was created by getElementsByTagName("comment"),
        # every node in this list is a <comment> element node.
        #
        # The text data required is a child of the current node.
        for child in node.childNodes:
            # If the child is a text node, append its data
            if child.nodeType == child.TEXT_NODE:
                rc.append(child.data)
    return ''.join(rc)

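Basically, the text you want is held in a text node that is a child of the <comment> element node, not in the element itself. You have to retrieve the element first, and only then can you read the data from its text-node children.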
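
To see the element/text-node distinction without the surrounding class, here is a minimal self-contained sketch. It parses the fragment from the question directly; the names SAMPLE and get_comment_text are hypothetical stand-ins for the asker's mydata.rnode and getResultCommentText, which are not shown in full.

```python
import xml.dom.minidom

# The XML fragment from the question, used as standalone input.
SAMPLE = """<result test="Passed" stamp="2011-01-25T12:40:46.166-08:00">
        <assertion>MultipleTestTool1</assertion>
        <comment>MultipleTestTool1 Passed</comment>
      </result>"""


def get_comment_text(parent):
    """Join the text-node children of every <comment> element under parent."""
    parts = []
    for element in parent.getElementsByTagName("comment"):
        # element is an ELEMENT_NODE; its text lives one level down
        for child in element.childNodes:
            if child.nodeType == child.TEXT_NODE:
                parts.append(child.data)
    return "".join(parts)


doc = xml.dom.minidom.parseString(SAMPLE)
comment = doc.getElementsByTagName("comment")[0]

# The element itself is not a text node, but its first child is.
print(comment.nodeType == comment.ELEMENT_NODE)
print(comment.firstChild.nodeType == comment.TEXT_NODE)
print(get_comment_text(doc))
```

This is why the original check on node.nodeType never matched: getElementsByTagName only ever returns element nodes.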