This is the response I get from my HTTP request:
<?xml version="1.0" encoding="UTF-8"?>
<Dataset name="aggregations/g/ds083.2/2/TP"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://xml.opendap.org/ns/DAP2"
xsi:schemaLocation="http://xml.opendap.org/ns/DAP2
http://xml.opendap.org/dap/dap2.xsd" >
<Attribute name="NC_GLOBAL" type="Container">
<Attribute name="Originating_or_generating_Center" type="String">
<value>US National Weather Service, National Centres for Environmental Prediction (NCEP)</value>
</Attribute>
<Attribute name="Originating_or_generating_Subcenter" type="String">
<value>0</value>
</Attribute>
<Attribute name="GRIB_table_version" type="String">
<value>2,1</value>
</Attribute>
<Attribute name="Type_of_generating_process" type="String">
<value>Forecast</value>
</Attribute>
<Attribute name="Analysis_or_forecast_generating_process_identifier_defined_by_originating_centre" type="String">
<value>Analysis from GDAS (Global Data Assimilation System)</value>
</Attribute>
<Attribute name="file_format" type="String">
<value>GRIB-2</value>
</Attribute>
<Attribute name="Conventions" type="String">
<value>CF-1.6</value>
</Attribute>
<Attribute name="history" type="String">
<value>Read using CDM IOSP GribCollection v3</value>
</Attribute>
<Attribute name="featureType" type="String">
<value>GRID</value>
</Attribute>
<Attribute name="_CoordSysBuilder" type="String">
<value>ucar.nc2.dataset.conv.CF1Convention</value>
</Attribute>
</Attribute>
<Array name="time1">
<Attribute name="units" type="String">
<value>Hour since 2007-12-06T12:00:00Z</value>
</Attribute>
<Attribute name="standard_name" type="String">
<value>time</value>
</Attribute>
<Attribute name="long_name" type="String">
<value>GRIB forecast or observation time</value>
</Attribute>
<Attribute name="calendar" type="String">
<value>proleptic_gregorian</value>
</Attribute>
<Attribute name="_CoordinateAxisType" type="String">
<value>Time</value>
</Attribute>
<Float64/>
<dimension name="time1" size="10380"/>
</Array>
</Dataset>
I am trying to parse this XML content using Python 3.5:
import requests
from xml.etree import ElementTree
response = requests.get("http://rda.ucar.edu/thredds/dodsC/aggregations/g/ds083.2/2/TP.ddx?time1")
tree = ElementTree.fromstring(response.content)
attr = tree.find("Attribute")
print(attr)
When I print this, I get None. What am I doing wrong? I also want to access the "Array" tag, but that returns None as well.
Answer 0: (score: 2)
As explained in the docs, the Dataset root tag carries the attribute xmlns="http://xml.opendap.org/ns/DAP2", so every tag name you search for must be prefixed with {http://xml.opendap.org/ns/DAP2}.
# should find something
tree.find("{http://xml.opendap.org/ns/DAP2}Attribute")
That section of the ElementTree documentation also shows how to map the namespace to a short prefix, which makes the queries more readable.
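To see why the bare name fails, it can help to look at the tags ElementTree actually stores. The sketch below does not call the server; it parses a trimmed, hand-copied fragment of the response shown in the question (the `XML` and `DAP2` names are only illustrative) and compares the unqualified and qualified lookups.

```python
from xml.etree import ElementTree

# Trimmed fragment of the response from the question, used here so the
# example runs offline. The real document contains many more elements.
XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<Dataset name="aggregations/g/ds083.2/2/TP" xmlns="http://xml.opendap.org/ns/DAP2">
  <Attribute name="NC_GLOBAL" type="Container"/>
  <Array name="time1"/>
</Dataset>"""

DAP2 = "{http://xml.opendap.org/ns/DAP2}"  # namespace URI in ElementTree's {uri} form

root = ElementTree.fromstring(XML)

# The default namespace is folded into every tag name.
print(root.tag)

# An unqualified name therefore matches nothing.
unqualified = root.find("Attribute")
print(unqualified)

# The fully qualified name matches the first Attribute child.
qualified = root.find(DAP2 + "Attribute")
print(qualified.get("name"))
```

Note that XML attributes such as `name` are not placed in the default namespace, so `get("name")` needs no prefix.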
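Answer 1: (score: 1)
The XML document uses a namespace, so your code needs to account for it. The etree documentation has an explanation and sample code.
Basically, you can do this: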
import requests
from xml.etree import ElementTree
response = requests.get('http://rda.ucar.edu/thredds/dodsC/aggregations/g/ds083.2/2/TP.ddx?time1')
tree = ElementTree.fromstring(response.content)
attr = tree.find("{http://xml.opendap.org/ns/DAP2}Attribute")
print(attr)
# prints something like:
# <Element '{http://xml.opendap.org/ns/DAP2}Attribute' at 0x7f147a292458>
# or declare the namespace like this
ns = {'dap2': 'http://xml.opendap.org/ns/DAP2'}
attr = tree.find("dap2:Attribute", ns)
print(attr)
# prints something like:
# <Element '{http://xml.opendap.org/ns/DAP2}Attribute' at 0x7f147a292458>
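The same prefix mapping should also work for the "Array" tag mentioned in the question. The following is a minimal sketch, again run against a trimmed offline copy of the response rather than the live URL; the names `NS`, `array_attrs`, and `size` are illustrative choices and not part of any API.

```python
from xml.etree import ElementTree

# Prefix-to-URI mapping; "dap2" is an arbitrary local prefix.
NS = {"dap2": "http://xml.opendap.org/ns/DAP2"}

# Trimmed fragment of the response from the question (offline stand-in).
XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<Dataset name="aggregations/g/ds083.2/2/TP" xmlns="http://xml.opendap.org/ns/DAP2">
  <Array name="time1">
    <Attribute name="units" type="String">
      <value>Hour since 2007-12-06T12:00:00Z</value>
    </Attribute>
    <Attribute name="calendar" type="String">
      <value>proleptic_gregorian</value>
    </Attribute>
    <Float64/>
    <dimension name="time1" size="10380"/>
  </Array>
</Dataset>"""

root = ElementTree.fromstring(XML)

# Every element in the path needs the prefix, including nested ones.
array = root.find("dap2:Array", NS)

# Collect each Attribute's name and the text of its <value> child.
array_attrs = {
    attr.get("name"): attr.findtext("dap2:value", namespaces=NS)
    for attr in array.findall("dap2:Attribute", NS)
}

# The dimension size is stored as a plain (unqualified) XML attribute.
size = int(array.find("dap2:dimension", NS).get("size"))

print(array.get("name"), array_attrs, size)
```

With the live response, replacing `XML` with `response.content` should give the same structure, assuming the server still returns the document shown in the question.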