Question

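Below is a toy example of an XML file. I have thousands of files like it, and I am having trouble parsing them.

Look at the text on the second line. All of my original files contain this text. When I delete i:type="Record" xmlns="http://schemas.datacontract.org/Storage" from the second line (keeping the rest of the text), I can get the accelx and accely values using the code given below.

How can I parse this file with the original text left in place?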

<?xml version="1.0" encoding="utf-8"?>
<ArrayOfRecord xmlns:i="http://www.w3.org/2001/XMLSchema-instance" i:type="Record" xmlns="http://schemas.datacontract.org/Storage">
  <AvailableCharts>
    <Accelerometer>true</Accelerometer>
    <Velocity>false</Velocity>
  </AvailableCharts>
  <Trics>
    <Trick>
      <EndOffset>PT2M21.835S</EndOffset>
      <Values>
        <TrickValue>
          <Acceleration>26.505801694441629</Acceleration>
          <Rotation>0.023379150593228679</Rotation>
        </TrickValue>
      </Values>
    </Trick>
  </Trics>
  <Values>
    <SensorValue>
      <accelx>-3.593643144</accelx>
      <accely>7.316485176</accely>
    </SensorValue>
    <SensorValue>
      <accelx>0.31103436</accelx>
      <accely>7.70408184</accely>
    </SensorValue>
  </Values>
</ArrayOfRecord>

Code to parse the data:

import lxml.etree as etree

tree = etree.parse(r"C:\testdel.xml")
root = tree.getroot()

# Unqualified path: only matches once the default namespace has been removed
val_of_interest = root.findall('./Values/SensorValue')

for sensor_val in val_of_interest:
    print(sensor_val.find('accelx').text)
    print(sensor_val.find('accely').text)

I asked a related question here: How to extract data from xml file that is deep down the tag

Thanks.

Answer 1

The confusion is caused by the following default namespace (a namespace declared without a prefix):

xmlns="http://schemas.datacontract.org/Storage"
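To see the effect of this declaration, it can help to print the tag names as the parser reports them. The following is a minimal sketch, assuming the standard-library xml.etree.ElementTree in place of lxml (the two behave the same way for this case) and a trimmed inline copy of the sample document; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Trimmed inline copy of the sample document (assumption: the XML declaration
# is omitted so the text can be parsed directly from a str).
XML = (
    '<ArrayOfRecord xmlns:i="http://www.w3.org/2001/XMLSchema-instance" '
    'i:type="Record" xmlns="http://schemas.datacontract.org/Storage">'
    '<Values><SensorValue><accelx>-3.593643144</accelx>'
    '<accely>7.316485176</accely></SensorValue></Values>'
    '</ArrayOfRecord>'
)

root = ET.fromstring(XML)

# Every tag is reported in "{namespace-uri}localname" form.
tags = [el.tag for el in root.iter()]
for tag in tags:
    print(tag)

# The unqualified path from the question therefore matches nothing.
unqualified = root.findall('./Values/SensorValue')
print(unqualified)
```

Because the parser stores each tag together with its namespace URI, a path that names only Values or SensorValue has nothing to match against.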

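Note that descendant elements without a prefix implicitly inherit the default namespace from their ancestor. To reference an element in a namespace, you need to map a prefix to the namespace URI and use that prefix in your XPath: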

# 'd' is an arbitrary prefix; only the URI has to match the document
ns = {'d': 'http://schemas.datacontract.org/Storage'}
val_of_interest = root.findall('./d:Values/d:SensorValue', ns)

for sensor_val in val_of_interest:
    print(sensor_val.find('d:accelx', ns).text)
    print(sensor_val.find('d:accely', ns).text)
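For a version that can be run without the file on disk, here is a rough sketch under a few assumptions: it uses the standard-library xml.etree.ElementTree rather than lxml, parses a trimmed inline copy of the sample, and introduces a hypothetical helper qualified() that is not part of either library. It shows the prefix mapping from above next to the equivalent "{uri}tag" spelling, which needs no prefix dictionary.

```python
import xml.etree.ElementTree as ET

NS_URI = 'http://schemas.datacontract.org/Storage'

# Trimmed inline copy of the sample document (assumption: declaration omitted).
XML = (
    '<ArrayOfRecord xmlns:i="http://www.w3.org/2001/XMLSchema-instance" '
    'i:type="Record" xmlns="http://schemas.datacontract.org/Storage">'
    '<Values>'
    '<SensorValue><accelx>-3.593643144</accelx><accely>7.316485176</accely></SensorValue>'
    '<SensorValue><accelx>0.31103436</accelx><accely>7.70408184</accely></SensorValue>'
    '</Values>'
    '</ArrayOfRecord>'
)

root = ET.fromstring(XML)

# Option 1: map an arbitrary prefix to the namespace URI.
ns = {'d': NS_URI}
readings = [
    (float(sv.find('d:accelx', ns).text), float(sv.find('d:accely', ns).text))
    for sv in root.findall('./d:Values/d:SensorValue', ns)
]
print(readings)


def qualified(tag):
    """Hypothetical helper: build the '{uri}tag' form of a tag name."""
    return '{%s}%s' % (NS_URI, tag)


# Option 2: spell out the namespace in each path step instead of using a prefix.
clark = root.findall('./%s/%s' % (qualified('Values'), qualified('SensorValue')))
print(len(clark))
```

The prefix form is usually easier to read when paths are long; the "{uri}tag" form can be handy when tag names are built programmatically.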
