I have tried many approaches, but I still cannot extract the data from this XML file.
<?xml version="1.0" encoding="UTF-8"?><cwbopendata xmlns="urn:cwb:gov:tw:cwbcommon:0.1">
<identifier>CWB_ANNUAL_DATA_20161017134902</identifier>
<sender>weather@cwb.gov.tw</sender>
<sent>2016-10-17 13:51+08:00</sent>
<status>Actual</status>
<msgType>Issue</msgType>
<dataid>CWB_B0024-002</dataid>
<scope>Public</scope>
<dataset>
<location>
<locationName>BANQIAO,板橋</locationName>
<stationId>466880</stationId>
<weatherElement>
<elementName>逐時觀測</elementName>
<time>
<obsTime>2015-10-17 01:00</obsTime>
<weatherElement>
<elementName>測站氣壓</elementName>
<elementValue>
<value>1012.9</value>
</elementValue>
</weatherElement>
<weatherElement>
<elementName>溫度</elementName>
<elementValue>
<value>23.2</value>
</elementValue>
</weatherElement>
<weatherElement>
<elementName>相對濕度</elementName>
<elementValue>
<value>68</value>
</elementValue>
</weatherElement>
<weatherElement>
<elementName>風速</elementName>
<elementValue>
<value>3.9</value>
</elementValue>
</weatherElement>
<weatherElement>
<elementName>風向</elementName>
<elementValue>
<value>東北東,ENE</value>
</elementValue>
</weatherElement>
<weatherElement>
<elementName>降水量</elementName>
<elementValue>
<value>0.0</value>
</elementValue>
</weatherElement>
</time>
<time>
<obsTime>2015-10-17 02:00</obsTime>
<weatherElement>
<elementName>測站氣壓</elementName>
<elementValue>
<value>1012.7</value>
</elementValue>
</weatherElement>
<weatherElement>
<elementName>溫度</elementName>
<elementValue>
<value>22.9</value>
</elementValue>
</weatherElement>
<weatherElement>
<elementName>相對濕度</elementName>
<elementValue>
<value>69</value>
</elementValue>
</weatherElement>
<weatherElement>
<elementName>風速</elementName>
<elementValue>
<value>3.3</value>
</elementValue>
</weatherElement>
<weatherElement>
<elementName>風向</elementName>
<elementValue>
<value>東北東,ENE</value>
</elementValue>
</weatherElement>
<weatherElement>
<elementName>降水量</elementName>
<elementValue>
<value>0.0</value>
</elementValue>
</weatherElement>
</time>
<time>
<obsTime>2015-10-17 03:00</obsTime>
<weatherElement>
<elementName>測站氣壓</elementName>
<elementValue>
<value>1012.5</value>
</elementValue>
</weatherElement>
<weatherElement>
<elementName>溫度</elementName>
<elementValue>
<value>22.8</value>
</elementValue>
</weatherElement>
<weatherElement>
<elementName>相對濕度</elementName>
<elementValue>
<value>70</value>
</elementValue>
</weatherElement>
<weatherElement>
<elementName>風速</elementName>
<elementValue>
<value>3.7</value>
</elementValue>
</weatherElement>
<weatherElement>
<elementName>風向</elementName>
<elementValue>
<value>東北東,ENE</value>
</elementValue>
</weatherElement>
<weatherElement>
<elementName>降水量</elementName>
<elementValue>
<value>0.0</value>
</elementValue>
</weatherElement>
</time>
<time>
<obsTime>2015-10-17 04:00</obsTime>
<weatherElement>
<elementName>測站氣壓</elementName>
<elementValue>
<value>1012.4</value>
</elementValue>
</weatherElement>
<weatherElement>
<elementName>溫度</elementName>
<elementValue>
<value>22.7</value>
</elementValue>
</weatherElement>
<weatherElement>
<elementName>相對濕度</elementName>
<elementValue>
<value>70</value>
</elementValue>
</weatherElement>
<weatherElement>
<elementName>風速</elementName>
<elementValue>
<value>3.1</value>
</elementValue>
</weatherElement>
<weatherElement>
<elementName>風向</elementName>
<elementValue>
<value>東北東,ENE</value>
</elementValue>
</weatherElement>
<weatherElement>
<elementName>降水量</elementName>
<elementValue>
<value>0.0</value>
</elementValue>
</weatherElement>
</time>
<time>
<obsTime>2015-10-17 05:00</obsTime>
<weatherElement>
<elementName>測站氣壓</elementName>
<elementValue>
<value>1012.6</value>
</elementValue>
</weatherElement>
<weatherElement>
<elementName>溫度</elementName>
<elementValue>
<value>22.6</value>
</elementValue>
</weatherElement>
<weatherElement>
<elementName>相對濕度</elementName>
<elementValue>
<value>71</value>
</elementValue>
</weatherElement>
<weatherElement>
<elementName>風速</elementName>
<elementValue>
<value>2.2</value>
</elementValue>
</weatherElement>
<weatherElement>
<elementName>風向</elementName>
<elementValue>
<value>東北東,ENE</value>
</elementValue>
</weatherElement>
<weatherElement>
<elementName>降水量</elementName>
<elementValue>
<value>0.0</value>
</elementValue>
</weatherElement>
</time>
<time>
Here is my code that tries to extract the data from it.
from lxml import objectify
path=r'C:\Users\champion\Desktop\data_science_race\weather\C-B0024-002.xml'
parsed=objectify.parse(open(path,'rb'))
root=parsed.getroot()
This part successfully extracts the data under location, including stationId.
data=[]
for elt in root.dataset.location:
    el_data={}
    skip_fields=['{urn:cwb:gov:tw:cwbcommon:0.1}weatherElement']
    for child in elt.getchildren():
        if child.tag in skip_fields:
            continue
        el_data[child.tag]=child.text
    data.append(el_data)
This part can extract obsTime, but it cannot extract elementName and elementValue.
data=[]
for elt in root.dataset.location.weatherElement.time:
    el_data={}
    skip_field=['{urn:cwb:gov:tw:cwbcommon:0.1}time']
    for child in elt.getchildren():
        if child.tag in skip_field:
            continue
        el_data[child.tag]=child.text
        for descendent in child.getchildren():
            el_data[descendent.tag]=descendent.text
            for next_descendent in descendent.getchildren():
                el_data[next_descendent.tag]=next_descendent.text
    data.append(el_data)
Answer 0 (score: 0)
I suggest using pyxb for this kind of task.
What you want to do is generate Python bindings from the schema:
python pyxbgen cwbopendata.xsd -m cwbopendata
The resulting code is clean. For example, the sample below prints the timestamps of your weather observations:
import cwbopendata
import pyxb
data = cwbopendata.CreateFromDocument(open('cwb_data_example.xml').read())
for t in data.dataset.location.weatherElement.time:
    print(t.obsTime)
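If generating bindings is not an option, the same traversal can be sketched with the standard library alone. Note why the loop in the question loses values: it stores everything under the tag name, and every nested weatherElement has the same elementName and elementValue tags, so each one overwrites the previous entry. The sketch below keys each value by the text of elementName instead. The helper name extract_observations, the NS prefix "cwb", and the shortened SAMPLE document are my own assumptions for illustration, not part of the CWB data or of any library.

```python
import xml.etree.ElementTree as ET

# Prefix chosen for this sketch; only the namespace URI comes from the file.
NS = {"cwb": "urn:cwb:gov:tw:cwbcommon:0.1"}

# Shortened stand-in for the real file (two hours, two elements each).
SAMPLE = """<cwbopendata xmlns="urn:cwb:gov:tw:cwbcommon:0.1">
<dataset><location>
<locationName>BANQIAO,板橋</locationName>
<stationId>466880</stationId>
<weatherElement><elementName>逐時觀測</elementName>
<time><obsTime>2015-10-17 01:00</obsTime>
<weatherElement><elementName>溫度</elementName>
<elementValue><value>23.2</value></elementValue></weatherElement>
<weatherElement><elementName>相對濕度</elementName>
<elementValue><value>68</value></elementValue></weatherElement>
</time>
<time><obsTime>2015-10-17 02:00</obsTime>
<weatherElement><elementName>溫度</elementName>
<elementValue><value>22.9</value></elementValue></weatherElement>
<weatherElement><elementName>相對濕度</elementName>
<elementValue><value>69</value></elementValue></weatherElement>
</time>
</weatherElement>
</location></dataset>
</cwbopendata>"""


def extract_observations(root):
    """Return one dict per <time> block, keyed by obsTime and element names."""
    rows = []
    for location in root.iterfind("cwb:dataset/cwb:location", NS):
        station_id = location.findtext("cwb:stationId", namespaces=NS)
        # The outer weatherElement wraps the hourly <time> blocks.
        for time_el in location.iterfind("cwb:weatherElement/cwb:time", NS):
            row = {
                "stationId": station_id,
                "obsTime": time_el.findtext("cwb:obsTime", namespaces=NS),
            }
            # Key each value by the element's name, not its tag,
            # so later elements do not overwrite earlier ones.
            for element in time_el.iterfind("cwb:weatherElement", NS):
                name = element.findtext("cwb:elementName", namespaces=NS)
                value = element.findtext(
                    "cwb:elementValue/cwb:value", namespaces=NS
                )
                row[name] = value
            rows.append(row)
    return rows


rows = extract_observations(ET.fromstring(SAMPLE))
for row in rows:
    print(row)
```

For the real file, replacing ET.fromstring(SAMPLE) with ET.parse(path).getroot() should give one row per hourly observation, and the resulting list of dicts can be passed to a tabular library if needed. All values come back as strings, so numeric fields still need converting.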