Question

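I wrote the following Python code to parse an XML file with lxml. I'm confused about why, in this case, it returns a memory address instead of the actual height value.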

import pdb
from pprint import pprint
from lxml import etree

def main():
    parser = etree.XMLParser(strip_cdata = False)
    tree = etree.parse("xml.xml", parser)
    string =  etree.tostring(tree.getroot())
    content = etree.fromstring(string)
    bodies = content.findall('Body')
    all_data = []
    for body in bodies:
        row  = {}
        height = body.findall('height')

        row['height'] = height
        all_data.append(row)
    print all_data
    pdb.set_trace()

The output I get is:

[{'height': [<Element height at 0x11ac80830>]}]

whereas I expected the height to be 178.

The data in the XML file is:

<Bodies>
  <Body>
    <name><![CDATA[abc]]></name>
       <body_name><![CDATA[asdjakhdas da sdasda sd]]></body_name>
       <final_height><![CDATA[199]]></final_height>
       <categories><![CDATA[a / b / c]]></categories>
       <hand><![CDATA[asdkj]]></hand>
        <height>178</height>
  </Body>
</Bodies>

Answer 1

If you want the text contained in the height element, rather than the element itself, in your variable:

height = body.find('height')
# Compare with None explicitly: an Element with no children is falsy.
row['height'] = float(height.text) if height is not None else None

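Note the use of find rather than findall: findall returns a list of every matching element, whereas it sounds like you expect only a single value (and do not need a list).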
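
As a rough end-to-end sketch of the approach above, the snippet below parses a shortened copy of the question's XML from a string and pulls out the height text. The `extract_heights` helper, the `XML` constant, and the second `<Body>` without a height are hypothetical additions for illustration, and the fallback to the standard library's `xml.etree.ElementTree` is an assumption made so the example still runs where lxml is not installed.

```python
try:
    from lxml import etree  # the library used in the question
except ImportError:
    # Assumption: the stdlib ElementTree offers the same find/findall/.text API.
    import xml.etree.ElementTree as etree

# Shortened sample data; the second <Body> is added to show the missing case.
XML = """<Bodies>
  <Body>
    <name><![CDATA[abc]]></name>
    <final_height><![CDATA[199]]></final_height>
    <height>178</height>
  </Body>
  <Body>
    <name><![CDATA[no height here]]></name>
  </Body>
</Bodies>"""


def extract_heights(xml_text):
    """Return one {'height': float or None} dict per <Body> element."""
    root = etree.fromstring(xml_text)
    rows = []
    for body in root.findall('Body'):      # findall: list of all matches
        height = body.find('height')       # find: first match, or None
        # Test against None explicitly: an Element without child elements
        # is falsy, so "if height" would skip <height>178</height>.
        value = float(height.text) if height is not None else None
        rows.append({'height': value})
    return rows


rows = extract_heights(XML)
print(rows)  # [{'height': 178.0}, {'height': None}]
```

The element's `.text` attribute holds the string "178"; printing the element itself only shows its repr, which is where the memory address in the question's output comes from.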
