Question

I am trying to read this URL and extract the information inside this tag: "identificationInfo"

However, when I use this code:

import requests
import xml.etree.ElementTree as ET

url = "http://qldspatial.information.qld.gov.au/catalogue/rest/document?id={96BD66CE-2207-4D35-815B-0E5648C0185F}&f=xml"

response = requests.get(url)

xml_content = response.content

tree = ET.fromstring(xml_content)

for child in tree:

    print(child.tag, child.attrib)

the output I get does not contain any attributes for the tags:

('{http://www.isotc211.org/2005/gmd}fileIdentifier', {})
('{http://www.isotc211.org/2005/gmd}language', {})
('{http://www.isotc211.org/2005/gmd}characterSet', {})
('{http://www.isotc211.org/2005/gmd}parentIdentifier', {})
('{http://www.isotc211.org/2005/gmd}hierarchyLevel', {})
('{http://www.isotc211.org/2005/gmd}contact', {})
('{http://www.isotc211.org/2005/gmd}dateStamp', {})
('{http://www.isotc211.org/2005/gmd}metadataStandardName', {})
('{http://www.isotc211.org/2005/gmd}metadataStandardVersion', {})
('{http://www.isotc211.org/2005/gmd}referenceSystemInfo', {})
('{http://www.isotc211.org/2005/gmd}identificationInfo', {})
('{http://www.isotc211.org/2005/gmd}distributionInfo', {})
('{http://www.isotc211.org/2005/gmd}dataQualityInfo', {})
('{http://www.isotc211.org/2005/gmd}metadataConstraints', {})

I'm not familiar with XML, and I can't work out why I can't see more information. Am I missing a step? Any help would be appreciated.

Answer 1

I'm using minidom instead of ElementTree. The code to get the values you want is:

from xml.dom import minidom
import requests

url = "http://qldspatial.information.qld.gov.au/catalogue/rest/document?id={96BD66CE-2207-4D35-815B-0E5648C0185F}&f=xml"

response = requests.get(url)
xml_content = response.content
doc = minidom.parseString(xml_content)
# First <identificationInfo> element in the document
identification = doc.getElementsByTagName("identificationInfo")[0]
# Text of the first <gco:Date> nested inside it
date = identification.getElementsByTagName('gco:Date')[0].firstChild.nodeValue # "2014-09-05"
# First responsible party, then the text of its first <gco:CharacterString>
responsible_party = identification.getElementsByTagName('CI_ResponsibleParty')[0]
department = responsible_party.getElementsByTagName('gco:CharacterString')[0].firstChild.nodeValue # "Department of National Parks, Sport and Racing"
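
As for why the original loop shows empty dictionaries: child.attrib only holds XML attributes written on the tag itself (name="value" pairs), and these elements have none. The data sits in the text of nested child elements, and iterating over the root only visits its direct children. If you would rather stay with ElementTree, here is a rough sketch of the same lookup using a namespace map. The SAMPLE_XML string is a hypothetical, heavily trimmed stand-in for the real response so the example runs offline, and the gco namespace URI and the nesting shown are assumptions based on the usual ISO 19139 layout, so check them against the actual document.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed stand-in for the catalogue response (assumed structure).
SAMPLE_XML = """<MD_Metadata xmlns="http://www.isotc211.org/2005/gmd"
             xmlns:gco="http://www.isotc211.org/2005/gco">
  <identificationInfo>
    <MD_DataIdentification>
      <citation>
        <CI_Citation>
          <date>
            <CI_Date>
              <date>
                <gco:Date>2014-09-05</gco:Date>
              </date>
            </CI_Date>
          </date>
          <citedResponsibleParty>
            <CI_ResponsibleParty>
              <organisationName>
                <gco:CharacterString>Department of National Parks, Sport and Racing</gco:CharacterString>
              </organisationName>
            </CI_ResponsibleParty>
          </citedResponsibleParty>
        </CI_Citation>
      </citation>
    </MD_DataIdentification>
  </identificationInfo>
</MD_Metadata>
"""

# Prefix -> namespace URI map. The gmd URI appears in the question's output;
# the gco URI is an assumption and should be checked against the real document.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}


def extract_identification(xml_text):
    """Return the date and organisation found under identificationInfo."""
    root = ET.fromstring(xml_text)
    # Direct child of the root, addressed with its namespace prefix.
    info = root.find("gmd:identificationInfo", NS)
    if info is None:
        return None
    # ".//" searches all descendants, not only direct children.
    date = info.find(".//gco:Date", NS)
    org = info.find(".//gmd:CI_ResponsibleParty//gco:CharacterString", NS)
    return {
        "date": date.text if date is not None else None,
        "organisation": org.text if org is not None else None,
    }


result = extract_identification(SAMPLE_XML)
print(result)
```

With the real response you would pass response.content to extract_identification instead of SAMPLE_XML. Note that the values come from .text on the nested elements, not from .attrib.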
