I'm trying to get a count of the element (Grade) values for a given node (School). Based on the sample below, the result would be: GR12 = 2, GR10 = 1, GR9 = 4, GR11 = 1:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ns1:SchoolUpload xmlns:ns1="http://abcsite.ca">
<ns1:School>
<ns1:SchoolID>123456</ns1:SchoolID>
<ns1:Students>
<ns1:Student>
<ns1:ID>1</ns1:ID><ns1:Grade>GR12</ns1:Grade><ns1:Name>A. Green</ns1:Name>
</ns1:Student>
<ns1:Student>
<ns1:ID>2</ns1:ID><ns1:Grade>GR9</ns1:Grade><ns1:Name>B. Green</ns1:Name>
</ns1:Student>
<ns1:Student>
<ns1:ID>3</ns1:ID><ns1:Grade>GR12</ns1:Grade><ns1:Name>A. Blue</ns1:Name>
</ns1:Student>
<ns1:Student>
<ns1:ID>4</ns1:ID><ns1:Grade>GR9</ns1:Grade><ns1:Name>B. Blue</ns1:Name>
</ns1:Student>
<ns1:Student>
<ns1:ID>5</ns1:ID><ns1:Grade>GR11</ns1:Grade><ns1:Name>C. Blue</ns1:Name>
</ns1:Student>
<ns1:Student>
<ns1:ID>6</ns1:ID><ns1:Grade>GR9</ns1:Grade><ns1:Name>A. Redd</ns1:Name>
</ns1:Student>
<ns1:Student>
<ns1:ID>7</ns1:ID><ns1:Grade>GR9</ns1:Grade><ns1:Name>B. Redd</ns1:Name>
</ns1:Student>
<ns1:Student>
<ns1:ID>8</ns1:ID><ns1:Grade>GR10</ns1:Grade><ns1:Name>C. Redd</ns1:Name>
</ns1:Student>
</ns1:Students>
</ns1:School>
</ns1:SchoolUpload>
My solution iterates over each School, builds a list for each Grade attribute value, and then uses the len() function to get the element count of each Grade list:
school_list = root.findall('.//{http://abcsite.ca}School')  # Get list of schools
for school in school_list:
    gr9 = school.findall("{http://abcsite.ca}Students/Student/*[@{http://abcsite.ca}Grade='GR9']")
    gr10 = school.findall("{http://abcsite.ca}Students/Student/*[@{http://abcsite.ca}Grade='GR10']")
    gr11 = school.findall("{http://abcsite.ca}Students/Student/*[@{http://abcsite.ca}Grade='GR11']")
    gr12 = school.findall("{http://abcsite.ca}Students/Student/*[@{http://abcsite.ca}Grade='GR12']")
    print(len(gr9))
    print(len(gr10))
    print(len(gr11))
    print(len(gr12))
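However, the school.findall() calls do not find the specified attribute values, so no matches are returned. I'm just learning Python (from https://docs.python.org/3.6/library/xml.etree.elementtree.html) and have spent the whole day trying different ideas. I thought this one would work, but I can't figure it out. Any suggestions or help would be greatly appreciated (likewise, if there is a more elegant solution, I'm all ears).
--- Edit: code revised per the suggestion in the comments below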
import xml.etree.ElementTree as ET

def main():
    ns = { 'ns1' : '{http://ontario.ca}' }
    school_file = 'c://Users/dperry2/Desktop/python/schools.XML'
    tree = ET.parse(school_file)
    root = tree.getroot()
    #//I attempted to use the namespace technique with the school list(below), and although it doesn't error, it didn't return anything; school_list was empty?!?!?
    #school_list = root.findall('.//ns1:School') #, ns)
    school_list = root.findall('.//{http://ontario.ca}School')
    for school in school_list:
        gr9 = school.findall("ns1:Students/ns1:Student/ns1:Grade[.='GR9']", ns)
        print(len(gr9))

main()
Answer 0 (score: 1)
Grade is an XML element, not an attribute. In XPath, @ is used to reference an XML attribute, and you are not reading any XML attribute here:
ns = { 'ns1' : 'http://abcsite.ca' }
school_list = root.findall('.//ns1:School', namespaces=ns)  # Get list of schools
for school in school_list:
    # Select Student elements whose Grade child has the text 'GR9', then take that Grade
    gr9 = school.findall("ns1:Students/ns1:Student[ns1:Grade='GR9']/ns1:Grade", namespaces=ns)
    # ...
    print(len(gr9))
    # ...
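To make the difference concrete, here is a minimal self-contained sketch. It assumes a trimmed-down copy of the sample document embedded as a string; the names XML_DATA and bad_ns and the shortened student list are for illustration only. It also suggests why the prefixed search in the edited question code came back empty: the values in the namespace dictionary should be the bare URI, without the surrounding braces.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of the sample document (illustrative only).
XML_DATA = """
<ns1:SchoolUpload xmlns:ns1="http://abcsite.ca">
  <ns1:School>
    <ns1:SchoolID>123456</ns1:SchoolID>
    <ns1:Students>
      <ns1:Student><ns1:ID>1</ns1:ID><ns1:Grade>GR12</ns1:Grade></ns1:Student>
      <ns1:Student><ns1:ID>2</ns1:ID><ns1:Grade>GR9</ns1:Grade></ns1:Student>
      <ns1:Student><ns1:ID>4</ns1:ID><ns1:Grade>GR9</ns1:Grade></ns1:Student>
    </ns1:Students>
  </ns1:School>
</ns1:SchoolUpload>
"""

root = ET.fromstring(XML_DATA)

ns = {"ns1": "http://abcsite.ca"}        # bare URI, no braces
bad_ns = {"ns1": "{http://abcsite.ca}"}  # braces in the value (hypothetical mistake)

school = root.find("ns1:School", ns)

# Element predicate: Student elements with a Grade child whose text is 'GR9'.
by_element = school.findall(
    "ns1:Students/ns1:Student[ns1:Grade='GR9']/ns1:Grade", ns
)

# Attribute predicate: Grade is not an attribute here, so nothing matches.
by_attribute = school.findall("ns1:Students/ns1:Student/*[@Grade='GR9']", ns)

# With braces in the dictionary value, the expanded tag name is wrong,
# so the search silently returns an empty list instead of raising.
schools_with_bad_map = root.findall(".//ns1:School", bad_ns)

print(len(by_element))            # 2
print(len(by_attribute))          # 0
print(len(schools_with_bad_map))  # 0
```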
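Since you reference prefixed elements several times in your code, a prefix-to-namespace dictionary is more convenient, as shown above. With lxml, you can use more idiomatic XPath that xml.etree in Python 3.6 does not support (xml.etree only supports a limited subset of XPath 1.0, and gained the [.='text'] predicate in Python 3.7):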
gr9 = school.findall("ns1:Students/ns1:Student/ns1:Grade[.='GR9']", namespaces=ns)
Note that . is a reference to the current context node, which in the case above is the ns1:Grade element.
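As for the more elegant solution the question asks about, one possible approach (a sketch, not the only way) is to collect every Grade element once per school and let collections.Counter do the tallying, so no grade value has to be hard-coded. The helper name count_grades and the embedded XML_DATA string are assumptions for illustration; in practice the tree would come from ET.parse() on the real file.

```python
import xml.etree.ElementTree as ET
from collections import Counter

NS = {"ns1": "http://abcsite.ca"}  # prefix -> bare namespace URI

# The sample document from the question, embedded for a runnable example.
XML_DATA = """
<ns1:SchoolUpload xmlns:ns1="http://abcsite.ca">
  <ns1:School>
    <ns1:SchoolID>123456</ns1:SchoolID>
    <ns1:Students>
      <ns1:Student><ns1:ID>1</ns1:ID><ns1:Grade>GR12</ns1:Grade></ns1:Student>
      <ns1:Student><ns1:ID>2</ns1:ID><ns1:Grade>GR9</ns1:Grade></ns1:Student>
      <ns1:Student><ns1:ID>3</ns1:ID><ns1:Grade>GR12</ns1:Grade></ns1:Student>
      <ns1:Student><ns1:ID>4</ns1:ID><ns1:Grade>GR9</ns1:Grade></ns1:Student>
      <ns1:Student><ns1:ID>5</ns1:ID><ns1:Grade>GR11</ns1:Grade></ns1:Student>
      <ns1:Student><ns1:ID>6</ns1:ID><ns1:Grade>GR9</ns1:Grade></ns1:Student>
      <ns1:Student><ns1:ID>7</ns1:ID><ns1:Grade>GR9</ns1:Grade></ns1:Student>
      <ns1:Student><ns1:ID>8</ns1:ID><ns1:Grade>GR10</ns1:Grade></ns1:Student>
    </ns1:Students>
  </ns1:School>
</ns1:SchoolUpload>
"""


def count_grades(school):
    """Return a Counter mapping each Grade text to its number of students."""
    grades = school.findall("ns1:Students/ns1:Student/ns1:Grade", NS)
    return Counter(grade.text for grade in grades)


root = ET.fromstring(XML_DATA)
for school in root.findall("ns1:School", NS):
    school_id = school.findtext("ns1:SchoolID", namespaces=NS)
    counts = count_grades(school)
    for grade, total in sorted(counts.items()):
        print(school_id, grade, total)
```

Because the Counter is keyed by whatever Grade values actually occur, counts['GR9'] gives the GR9 total, and a grade that never appears simply yields 0.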