I have data of the following kind returned to me as XML (many rooms are returned; this is one example of what I get back):
<?xml version="1.0" encoding="UTF-8"?>
<rooms>
<total-results>1</total-results>
<items-per-page>1</items-per-page>
<start-index>0</start-index>
<room>
<id>xxxxxxxx</id>
<etag>5</etag>
<link rel="http://schemas.com.mysite.building" title="building" href="https://mysite.me.myschool.edu:8443/ess/scheduleapi/v1/buildings/yyyyyyyyy"/>
<name>1.306</name>
<status>active</status>
<link rel="self" title="self" href="https://mysite.me.myschool.edu:8443/ess/scheduleapi/v1/rooms/aaaaaaaaa"/>
</room>
</rooms>
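If nodeType == node.TEXT_NODE, I seem to be able to access the data (so I can see that I have room 1.306). I also seem to be able to access the nodeName link, but what I really need to know is whether that room is in one of my acceptable buildings, so I need to get at the rest of that element and see the yyyyyyyyy. Can anyone advise?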
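For reference, the names nodeType, TEXT_NODE and nodeName suggest a DOM-style parser. Assuming it is the standard library's xml.dom.minidom (the question does not say), attribute values such as href are read from the element node itself with getAttribute, not from its text children. The following is a minimal Python 3 sketch under that assumption; SAMPLE and building_hrefs are hypothetical names used only for illustration.

```python
from xml.dom import minidom

# Trimmed copy of the example response from the question (illustrative only).
SAMPLE = """<rooms>
  <room>
    <id>xxxxxxxx</id>
    <link rel="http://schemas.com.mysite.building" title="building" href="https://mysite.me.myschool.edu:8443/ess/scheduleapi/v1/buildings/yyyyyyyyy"/>
    <name>1.306</name>
    <link rel="self" title="self" href="https://mysite.me.myschool.edu:8443/ess/scheduleapi/v1/rooms/aaaaaaaaa"/>
  </room>
</rooms>"""


def building_hrefs(xml_text):
    """Return the href of every <link title="building"> element."""
    dom = minidom.parseString(xml_text)
    hrefs = []
    for link in dom.getElementsByTagName("link"):
        # Attributes live on the element node, not on a TEXT_NODE child.
        if link.getAttribute("title") == "building":
            hrefs.append(link.getAttribute("href"))
    return hrefs


print(building_hrefs(SAMPLE))
```

The building id is then the last path segment of each returned href.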
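OK, @vezult, following your suggestion, here is what I finally came up with (working code!) using ElementTree. It may not be the most Pythonic (or ElementTree-ic?) way to do this, but it seems to work. I'm happy that I can now access the .tag, .attrib, and .text of every part of my XML. I welcome any suggestions on how to make it better.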
import urllib2  # Python 2; build_request, resourceUrl, in_rm and out_bldg are defined elsewhere
from xml.etree import ElementTree

# We start out knowing our room name and our building id. However, the same room can exist in many buildings.
# Examine the rooms we've received and get the id of the one with our name that is also in our building.
# Query the API for a list of rooms, getting u back.
request = build_request(resourceUrl)
u = urllib2.urlopen(request.to_url())
mydata = u.read()
root = ElementTree.fromstring(mydata)
print 'tree root', root.tag, root.attrib, root.text
for child in root:
    if child.tag == 'room':
        for child2 in child:
            # the id tag comes before the name tag, so hold on to it
            if child2.tag == "id":
                hold_id = child2.text
            # the building link comes before the room name, so hold on to it
            if child2.tag == 'link':  # if this is a link
                if "building" in child2.attrib['href']:  # and it's a building link
                    hold_link_data = child2.attrib['href']
            if child2.tag == 'name':
                if (out_bldg in hold_link_data and  # the building link we're looking at has our building in it
                        (in_rm == child2.text)):  # and this room name is our room name
                    out_rm = hold_id
                    break  # get out of the inner for-loop
Answer 0 (score: 3)
You didn't indicate which library you are using, so I'll assume it is the standard Python ElementTree module. In that case, do the following:
from xml.etree import ElementTree
tree = ElementTree.fromstring("""<?xml version="1.0" encoding="UTF-8"?>
<rooms>
<total-results>1</total-results>
<items-per-page>1</items-per-page>
<start-index>0</start-index>
<room>
<id>xxxxxxxx</id>
<etag>5</etag>
<link rel="http://schemas.com.mysite.building" title="building" href="https://mysite.me.myschool.edu:8443/ess/scheduleapi/v1/buildings/yyyyyyyyy" />
<name>1.306</name>
<status>active</status>
<link rel="self" title="self" href="https://mysite.me.myschool.edu:8443/ess/scheduleapi/v1/rooms/aaaaaaaaa" />
</room>
</rooms>
""")
# Select every building link element in the example XML
for node in tree.findall('./room/link[@title="building"]'):
    # the 'attrib' attribute is a dictionary containing the node attributes
    print(node.attrib['href'])
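The same path syntax can also handle the room lookup described in the question, without tracking element order by hand. The following is a minimal Python 3 sketch, not the asker's actual code: find_room_id and SAMPLE are hypothetical names, and matching the building by the final path segment of the href is an assumption about how the API forms its URLs.

```python
from xml.etree import ElementTree

# Trimmed copy of the example response from the question (illustrative only).
SAMPLE = """<rooms>
  <room>
    <id>xxxxxxxx</id>
    <link rel="http://schemas.com.mysite.building" title="building" href="https://mysite.me.myschool.edu:8443/ess/scheduleapi/v1/buildings/yyyyyyyyy"/>
    <name>1.306</name>
    <link rel="self" title="self" href="https://mysite.me.myschool.edu:8443/ess/scheduleapi/v1/rooms/aaaaaaaaa"/>
  </room>
</rooms>"""


def find_room_id(xml_text, room_name, building_id):
    """Return the id of the room with this name in this building, or None."""
    root = ElementTree.fromstring(xml_text)
    for room in root.findall("room"):
        # findtext returns the text of the first matching child, or None.
        if room.findtext("name") != room_name:
            continue
        link = room.find("link[@title='building']")
        if link is None:
            continue
        # Assumption: the building id is the last path segment of the href.
        if link.get("href", "").endswith("/" + building_id):
            return room.findtext("id")
    return None


print(find_room_id(SAMPLE, "1.306", "yyyyyyyyy"))
```

Because find and findtext look children up by tag, the result does not depend on whether id, link, or name comes first inside each room element.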