Question

I have written the code below, but I can't work out why it produces an empty DataFrame.

     <Report xmlns="urn:crystal-reports:schemas:report-detail"  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:crystal-reports:schemas:report-detail http://www.businessobjects.com/products/xml/CR2008Schema.xsd">
        <Details Level="1">
        <Field Name='ReportNo'><Value>90</Value>

ns = {"urn:crystal-reports:schemas:report-detail#"}


def test(xml_file, df_cols):
    global df
    xtree = et.parse(xml_file)
    xroot = xtree.getroot()
    out_xml = pd.DataFrame(columns=df_cols)

    for node in xroot.findall("urn：Group[1]/Details/Field", ns):
        name = node.attrib.get("Name")
        value = node.find("Value").text

Answer 1

The XML snippet you pasted doesn't match your query: it lacks the <Group> element you are searching for.

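Either way, you need to:

- have a correct namespace map (a dict); you currently have a set with one entry
- separate the namespace alias with a real colon : rather than a fullwidth colon ：
- put the namespace on each element of the query, including the Value subnode query.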
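
To illustrate the last point, here is a minimal sketch (the inline XML is a made-up stand-in for the real report, not the asker's file). Because the root declares a default namespace, a path with unqualified steps matches nothing, whereas the same path with an alias on every step finds the field.

```python
import xml.etree.ElementTree as et

# Hypothetical sample document; the default xmlns applies to every element in it.
sample = (
    '<Report xmlns="urn:crystal-reports:schemas:report-detail">'
    '<Group><Details Level="1">'
    '<Field Name="ReportNo"><Value>90</Value></Field>'
    '</Details></Group></Report>'
)
root = et.fromstring(sample)
nsmap = {"r": "urn:crystal-reports:schemas:report-detail"}

# No namespace on the path steps: nothing matches, so any loop body never runs.
unqualified = root.findall("Group/Details/Field")
# Every step carries the alias: the Field element is found.
qualified = root.findall("r:Group/r:Details/r:Field", nsmap)

print(len(unqualified), len(qualified))
```

An empty match list like the first one is a likely reason a loop over the results never adds any rows.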

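Here I've chosen r (short for "report") as the alias for urn:crystal-reports:schemas:report-detail. If you'd rather not use an alias, you can also use the long syntax {urn:crystal-reports:schemas:report-detail}Group and so on, in which case you don't need a namespace map.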
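
As a rough sketch of that long syntax (the sample XML and the `NS` helper variable are illustrative assumptions, not part of the original code), the namespace URI is written in braces in front of each tag name and no map is passed to `findall` or `find`:

```python
import xml.etree.ElementTree as et

# Hypothetical sample document using the same default namespace as the report.
sample = (
    '<Report xmlns="urn:crystal-reports:schemas:report-detail">'
    '<Group><Details Level="1">'
    '<Field Name="ReportNo"><Value>90</Value></Field>'
    '</Details></Group></Report>'
)
root = et.fromstring(sample)

# Illustrative helper: the namespace URI wrapped in braces, prepended to each tag.
NS = "{urn:crystal-reports:schemas:report-detail}"
path = f"{NS}Group/{NS}Details/{NS}Field"

# No namespace map argument is needed with this notation.
pairs = [(f.get("Name"), f.find(f"{NS}Value").text) for f in root.findall(path)]
print(pairs)
```

The alias form is usually easier to read; the brace form can be handy when a path is built programmatically.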

With all of those issues fixed, we get

import xml.etree.ElementTree as et

data = """<?xml version="1.0"?>
<Report xmlns="urn:crystal-reports:schemas:report-detail" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:crystal-reports:schemas:report-detail http://www.businessobjects.com/products/xml/CR2008Schema.xsd">
  <Group>
      <Details Level="1">
        <Field Name="ReportNo"><Value>90</Value></Field>
        <Field Name="Other"><Value>644</Value></Field>
      </Details>
  </Group>
</Report>
"""

nsmap = {"r": "urn:crystal-reports:schemas:report-detail"}
xroot = et.XML(data)  # could read from file here

for node in xroot.findall("r:Group/r:Details/r:Field", nsmap):
    name = node.attrib.get("Name")
    value = node.find("r:Value", nsmap).text
    print(name, value)

The output here is

ReportNo 90
Other 644

Inserting this into a DataFrame is left as an exercise for the reader.
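
As a possible starting point for the DataFrame step, the sketch below collects one dict per <Details> element, keyed by each field's Name. The one-row-per-<Details> layout and the variable names `rows` and `row` are assumptions for illustration; the real report may call for a different shape.

```python
import xml.etree.ElementTree as et

data = """<Report xmlns="urn:crystal-reports:schemas:report-detail">
  <Group>
    <Details Level="1">
      <Field Name="ReportNo"><Value>90</Value></Field>
      <Field Name="Other"><Value>644</Value></Field>
    </Details>
  </Group>
</Report>"""

nsmap = {"r": "urn:crystal-reports:schemas:report-detail"}
xroot = et.fromstring(data)

rows = []
for details in xroot.findall("r:Group/r:Details", nsmap):
    # Assumed layout: one record per <Details>, mapping Field Name -> Value text.
    row = {
        field.get("Name"): field.findtext("r:Value", default=None, namespaces=nsmap)
        for field in details.findall("r:Field", nsmap)
    }
    rows.append(row)

print(rows)
```

Passing `rows` to `pd.DataFrame(rows)` (with pandas imported as `pd`, as in the question) should then yield one row per <Details> element with one column per field name. The values arrive as strings, so numeric columns would still need converting.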

How do I parse an XML file with namespaces?