I am trying to read this URL and extract the information inside this tag: "identificationInfo"
However, when I use this code:
import requests
import xml.etree.ElementTree as ET
url = "http://qldspatial.information.qld.gov.au/catalogue/rest/document?id={96BD66CE-2207-4D35-815B-0E5648C0185F}&f=xml"
response = requests.get(url)
xml_content = response.content
tree = ET.fromstring(xml_content)
for child in tree:
    print(child.tag, child.attrib)
the result I get does not contain any attributes for the tags:
('{http://www.isotc211.org/2005/gmd}fileIdentifier', {})
('{http://www.isotc211.org/2005/gmd}language', {})
('{http://www.isotc211.org/2005/gmd}characterSet', {})
('{http://www.isotc211.org/2005/gmd}parentIdentifier', {})
('{http://www.isotc211.org/2005/gmd}hierarchyLevel', {})
('{http://www.isotc211.org/2005/gmd}contact', {})
('{http://www.isotc211.org/2005/gmd}dateStamp', {})
('{http://www.isotc211.org/2005/gmd}metadataStandardName', {})
('{http://www.isotc211.org/2005/gmd}metadataStandardVersion', {})
('{http://www.isotc211.org/2005/gmd}referenceSystemInfo', {})
('{http://www.isotc211.org/2005/gmd}identificationInfo', {})
('{http://www.isotc211.org/2005/gmd}distributionInfo', {})
('{http://www.isotc211.org/2005/gmd}dataQualityInfo', {})
('{http://www.isotc211.org/2005/gmd}metadataConstraints', {})
I'm not familiar with XML, and I can't work out why I'm not seeing more information. Am I missing a step? Any help would be appreciated.
Answer 0 (score: 1)
I'm using minidom instead of ElementTree. The code to get the required values is:
from xml.dom import minidom
import requests

url = "http://qldspatial.information.qld.gov.au/catalogue/rest/document?id={96BD66CE-2207-4D35-815B-0E5648C0185F}&f=xml"
response = requests.get(url)
xml_content = response.content
doc = minidom.parseString(xml_content)

# getElementsByTagName matches the tag name as written in the document, prefix included
identification = doc.getElementsByTagName("identificationInfo")[0]
# Text of the first gco:Date element: "2014-09-05"
date = identification.getElementsByTagName('gco:Date')[0].firstChild.nodeValue
responsible_party = identification.getElementsByTagName('CI_ResponsibleParty')[0]
# Text of its first gco:CharacterString: "Department of National Parks, Sport and Racing"
department = responsible_party.getElementsByTagName('gco:CharacterString')[0].firstChild.nodeValue
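As for why the original loop printed empty dictionaries: `child.attrib` only holds XML attributes, and iterating over the root only visits its direct children. The values in this kind of document sit in the text of nested elements. If you would rather stay with ElementTree, here is a minimal sketch. It assumes the response looks roughly like the trimmed `SAMPLE_XML` below, which is a made-up stand-in and not the real catalogue output, and `identification_texts` is just a helper name chosen for illustration. The `gmd` URI is the one shown in the question's output; the `gco` URI is assumed to be the matching ISO one.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for the catalogue response (assumption).
SAMPLE_XML = """<MD_Metadata xmlns="http://www.isotc211.org/2005/gmd"
             xmlns:gco="http://www.isotc211.org/2005/gco">
  <fileIdentifier>
    <gco:CharacterString>{96BD66CE-2207-4D35-815B-0E5648C0185F}</gco:CharacterString>
  </fileIdentifier>
  <identificationInfo>
    <MD_DataIdentification>
      <citation>
        <CI_Citation>
          <date>
            <CI_Date>
              <date><gco:Date>2014-09-05</gco:Date></date>
            </CI_Date>
          </date>
        </CI_Citation>
      </citation>
    </MD_DataIdentification>
  </identificationInfo>
</MD_Metadata>"""

# Prefix-to-URI map used by find(); the prefixes here are local to this script.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}


def identification_texts(xml_text):
    """Return (local tag name, text) pairs for every element under
    identificationInfo that carries non-blank text."""
    root = ET.fromstring(xml_text)
    ident = root.find("gmd:identificationInfo", NS)
    if ident is None:
        return []
    results = []
    # iter() walks the whole subtree, not just the direct children.
    for elem in ident.iter():
        text = (elem.text or "").strip()
        if text:
            # Drop the "{namespace}" part that ElementTree puts in front of the tag.
            local_name = elem.tag.split("}", 1)[-1]
            results.append((local_name, text))
    return results


if __name__ == "__main__":
    for name, text in identification_texts(SAMPLE_XML):
        print(name, text)
```

With the real response you would pass `response.content` instead of `SAMPLE_XML`. A single value can also be fetched directly with a path such as `ident.find(".//gco:Date", NS)`.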