Question

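I am trying to extract specific parts of an XML file and move them into a pandas DataFrame. After following a few tutorials on xml.etree, I am still stuck on getting the output. So far I have managed to find the child nodes, but I cannot access them (that is, I cannot get the actual data out of them). Here is what I have so far:

import xml.etree.ElementTree as ET

tree = ET.parse('data.xml')
root = tree.getroot()
root.tag

# find all nodes within the XML data
tree.findall(".//*")

# access the node
tree.findall(".//{http://someUrl.nl/schema/enterprise/program}programSummaryText")

What I want is to get the data from the node (in particular the child node programDescriptions), and of course a few others as well. But I want to focus on this one first.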

Some data to work with:

programDescriptionText xml:lang="nl"

Answer 1

Try the following code (55703748.xml contains the XML you posted):

import xml.etree.ElementTree as ET

tree = ET.parse('55703748.xml')
root = tree.getroot()
nodes = root.findall(".//{http://someUrl.nl/schema/enterprise/program}programSummaryText")
for node in nodes:
    print(node.text)

Output

short Program Course Name summary
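
To get from individual nodes to something pandas can consume, one option is to collect each element's attributes and text into a list of dicts. The sketch below is only an illustration: the sample XML layout, the `prog` prefix, and the `description_rows` helper are assumptions, since the real file's structure is not shown here. Only the namespace URL and the element names come from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the real file's layout is not shown, so this structure is assumed.
SAMPLE_XML = """\
<program xmlns="http://someUrl.nl/schema/enterprise/program">
  <programSummaryText>short Program Course Name summary</programSummaryText>
  <programDescriptions>
    <programDescriptionText xml:lang="nl">Beschrijving</programDescriptionText>
    <programDescriptionText xml:lang="en">Description</programDescriptionText>
  </programDescriptions>
</program>
"""

# Prefix-to-URI map, so paths do not need the full {namespace} spelled out each time.
# The prefix name "prog" is an arbitrary choice made for this sketch.
NS = {"prog": "http://someUrl.nl/schema/enterprise/program"}

# ElementTree stores xml:lang under the reserved XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def description_rows(root):
    """Return one dict per programDescriptionText element (hypothetical helper)."""
    rows = []
    path = ".//prog:programDescriptions/prog:programDescriptionText"
    for node in root.findall(path, NS):
        rows.append({"lang": node.get(XML_LANG), "text": (node.text or "").strip()})
    return rows


root = ET.fromstring(SAMPLE_XML)
rows = description_rows(root)
print(rows)
```

Passing `rows` to `pandas.DataFrame(rows)` should then give a frame with `lang` and `text` columns, one row per description.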

Parsing deeply nested XML into a pandas DataFrame
